PHOSPHOLIPASES, NUCLEIC ACIDS ENCODING THEM AND
METHODS FOR MAKING AND USING THEM 

FIELD OF THE INVENTION 
This invention relates generally to phospholipase enzymes, polynucleotides 
encoding the enzymes, methods of making and using these polynucleotides and polypeptides. 
In particular, the invention provides novel polypeptides having phospholipase activity, 
nucleic acids encoding them and antibodies that bind to them. Industrial methods and 
products comprising use of these phospholipases are also provided. 

BACKGROUND 

Phospholipases are enzymes that hydrolyze the ester bonds of phospholipids. 
Corresponding to their importance in the metabolism of phospholipids, these enzymes are 
widespread among prokaryotes and eukaryotes. The phospholipases affect the metabolism, 
construction and reorganization of biological membranes and are involved in signal cascades. 
Several types of phospholipases are known which differ in their specificity according to the
position of the bond attacked in the phospholipid molecule. Phospholipase A1 (PLA1)
removes the 1-position fatty acid to produce free fatty acid and 1-lyso-2-acylphospholipid.
Phospholipase A2 (PLA2) removes the 2-position fatty acid to produce free fatty acid and
1-acyl-2-lysophospholipid. PLA1 and PLA2 enzymes can be intra- or extra-cellular,
membrane-bound or soluble. Intracellular PLA2 is found in almost every mammalian cell.
Phospholipase C (PLC) removes the phosphate moiety to produce 1,2 diacylglycerol and
phospho base. Phospholipase D (PLD) produces 1,2-diacylglycerophosphate and base group.
PLC and PLD are important in cell function and signaling. PLD has been the dominant
phospholipase in biocatalysis (see, e.g., Godfrey, T. and West S. (1996) Industrial
Enzymology, 299-300, Stockton Press, New York). Patatins are another type of
phospholipase, thought to work as a PLA (see for example, Hirschberg HJ, et al., (2001), Eur
J Biochem 268(19):5037-44).

Common oilseeds, such as soybeans, rapeseed, sunflower seeds, sesame and
peanuts are used as sources of oils and feedstock. In the oil extraction process, the seeds are
mechanically and thermally treated. The oil is separated from the meal by a
solvent. Using distillation, the solvent is then separated from the oil and recovered. The oil is
"degummed" and refined. The solvent content in the meal can be evaporated by thermal
treatment in a "desolventizer toaster," followed by meal drying and cooling. After the solvent
has been separated by distillation, the raw oil produced is processed into edible oil, using
special degumming procedures and physical refining. It can also be utilized as feedstock for
the production of fatty acids and methyl ester. The meal can be used for animal rations.

Degumming is the first step in vegetable oil refining and it is designed to 
remove contaminating phosphatides that are extracted with the oil but interfere with the 
subsequent oil processing. These phosphatides are soluble in the vegetable oil only in an 
anhydrous form and can be precipitated and removed if they are simply hydrated. Hydration 
is usually accomplished by mixing a small proportion of water continuously with
substantially dry oil. Typically, the amount of water is 75% of the phosphatide content,
which is typically 1 to 1.5%. The temperature is not highly critical, although separation of
the hydrated gums is better when the viscosity of the oil is reduced at 50°C to 80°C.
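
The water dose described above can be worked through as simple arithmetic. The following Python sketch is only an illustration: it assumes that the 1 to 1.5% phosphatide content and the 75% water proportion are both expressed by weight, and the function name and the 1000 kg batch size are hypothetical.

```python
def hydration_water_kg(oil_kg: float, phosphatide_fraction: float,
                       water_to_phosphatide: float = 0.75) -> float:
    """Estimate the water needed to hydrate the phosphatides in a batch of oil.

    Assumes weight-based fractions: phosphatide_fraction is the phosphatide
    content of the oil (0.01 to 0.015 in the text above) and
    water_to_phosphatide is the water dose relative to the phosphatides.
    """
    if oil_kg < 0:
        raise ValueError("oil_kg must be non-negative")
    if not 0.0 <= phosphatide_fraction <= 1.0:
        raise ValueError("phosphatide_fraction must be between 0 and 1")
    phosphatide_kg = oil_kg * phosphatide_fraction
    return phosphatide_kg * water_to_phosphatide


if __name__ == "__main__":
    # Hypothetical 1000 kg batch at the low and high end of the stated range.
    for fraction in (0.01, 0.015):
        water = hydration_water_kg(1000.0, fraction)
        print(f"{fraction:.1%} phosphatides: about {water:.2f} kg water per 1000 kg oil")
```

Under these assumptions the water dose works out to roughly 0.75 to 1.1% of the oil weight, which is consistent with the "small proportion of water" described in the text.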

Many methods for oil degumming are currently used. The process of oil
degumming can be enzymatically assisted by using phospholipase enzymes. Phospholipases
A1 and A2 have been used for oil degumming in various commercial processes, e.g.,
"ENZYMAX™ degumming" (Lurgi Life Science Technologies GmbH, Germany).
Phospholipase C (PLC) also has been considered for oil degumming because the phosphate
moiety generated by its action on phospholipids is very water soluble and easy to remove and
the diglyceride would stay with the oil and reduce losses; see e.g., Godfrey, T. and West S.
(1996) Industrial Enzymology, pp.299-300, Stockton Press, New York; Dahlke (1998) "An
enzymatic process for the physical refining of seed oils," Chem. Eng. Technol. 21:278-281;
Clausen (2001) "Enzymatic oil degumming by a novel microbial phospholipase," Eur. J.
Lipid Sci. Technol. 103:333-340.

High phosphatide oils such as soy, canola and sunflower are processed
differently than other oils such as palm. Unlike the steam or "physical refining" process for
low phosphatide oils, these high phosphorus oils require special chemical and mechanical
treatments to remove the phosphorus-containing phospholipids. These oils are typically
refined chemically in a process that entails neutralizing the free fatty acids to form soap and
an insoluble gum fraction. The neutralization process is highly effective in removing free
fatty acids and phospholipids but this process also results in significant yield losses and
sacrifices in quality. In some cases, the high phosphatide crude oil is degummed in a step
preceding caustic neutralization. This is the case for soy oil utilized for lecithin wherein the
oil is first water or acid degummed.

Phytosterols (plant sterols) are members of the "triterpene" family of natural
products, which includes more than 100 different phytosterols and more than 4000 other
types of triterpenes. In general, phytosterols are thought to stabilize plant membranes, with
an increase in the sterol/phospholipid ratio leading to membrane rigidification. Chemically,
phytosterols closely resemble cholesterol in structure. The major phytosterols are
β-sitosterol, campesterol and stigmasterol. Others include stigmastanol (β-sitostanol),
sitostanol, desmosterol, chalinasterol, poriferasterol, clionasterol and brassicasterol.

Plant sterols are important agricultural products for health and nutritional 
industries. They are useful emulsifiers for cosmetic manufacturers and supply the majority of
steroidal intermediates and precursors for the production of hormone pharmaceuticals. The
saturated analogs of phytosterols and their esters have been suggested as effective
cholesterol-lowering agents with cardiologic health benefits. Plant sterols reduce serum
cholesterol levels by inhibiting cholesterol absorption in the intestinal lumen and have
immunomodulating properties at extremely low concentrations, including enhanced cellular
response of T lymphocytes and cytotoxic ability of natural killer cells against a cancer cell
line. In addition, their therapeutic effect has been demonstrated in clinical studies for
treatment of pulmonary tuberculosis, rheumatoid arthritis, management of HIV-infected
patients and inhibition of immune stress in marathon runners.

Plant sterol esters, also referred to as phytosterol esters, were approved as 
GRAS (Generally Recognized As Safe) by the US Food and Drug Administration (FDA) for
use in margarines and spreads in 1999. In September 2000, the FDA also issued an interim
rule that allows health-claims labeling of foods containing phytosterol ester. Consequently,
enrichment of foods with phytosterol esters is highly desired for consumer acceptance. 

SUMMARY OF THE INVENTION 
The invention provides isolated or recombinant nucleic acids comprising a 
nucleic acid sequence having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%,
58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%,
74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%)
sequence identity to an exemplary nucleic acid of the invention, e.g., SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ
ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25,
SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID
NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID
NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69,
SEQ ID NO:71, SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID
NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91,
SEQ ID NO:93, SEQ ID NO:95, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID
NO:103, SEQ ID NO:105 over a region of at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75,
100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950,
1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 1700,
1750, 1800, 1850, 1900, 1950, 2000, 2050, 2100, 2200, 2250, 2300, 2350, 2400, 2450, 2500,
or more residues, wherein the nucleic acid encodes at least one polypeptide having a
phospholipase, e.g., a phospholipase A, C or D activity, and the sequence identities are
determined by analysis with a sequence comparison algorithm or by a visual inspection.
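
As an illustration of what "sequence identity over a region" means numerically, the following Python sketch computes percent identity between two sequences that have already been aligned (gaps written as "-"). It is a simplified stand-in, not the BLAST algorithm referred to elsewhere in this document; the function names are hypothetical, and the choice to count a gap against a residue as a mismatch is an assumption, since alignment tools differ on this point.

```python
from typing import Tuple


def identity_over_region(aligned_a: str, aligned_b: str) -> Tuple[float, int]:
    """Return (percent identity, number of compared columns) for two
    pre-aligned sequences of equal length. Columns where both sequences
    carry a gap are skipped; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:  # double-gap columns were skipped, so this is a residue match
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared, compared


def satisfies_threshold(aligned_a: str, aligned_b: str,
                        min_identity: float, min_region: int) -> bool:
    """Check an identity threshold (e.g. 81.0) over a minimum region length."""
    identity, compared = identity_over_region(aligned_a, aligned_b)
    return compared >= min_region and identity >= min_identity


if __name__ == "__main__":
    # Hypothetical 10-residue aligned fragments differing at one position.
    print(identity_over_region("ATGCATGCAT", "ATGCATGCAA"))
    print(satisfies_threshold("ATGCATGCAT", "ATGCATGCAA", 81.0, 10))
```

The threshold check takes both a minimum identity and a minimum region length because the text above always pairs an identity percentage with a region of at least a stated number of residues.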

The invention provides isolated or recombinant nucleic acids comprising a 
nucleic acid sequence having at least 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%)
sequence identity to SEQ ID NO:1 over a region of at least about 10, 15, 20, 25, 30, 35, 40,
45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850 or
more consecutive residues, wherein the nucleic acids encode at least one polypeptide having
a phospholipase, e.g., a phospholipase A, B, C or D activity and the sequence identities are
determined by analysis with a sequence comparison algorithm or by a visual inspection. 

The invention provides isolated or recombinant nucleic acids comprising a 
nucleic acid sequence having at least 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 
87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or
complete (100%) sequence identity to SEQ ID NO:3 over a region of at least about 10, 15,
20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650,
700, 750, 800, 850 or more consecutive residues, wherein the nucleic acids encode at least one
polypeptide having a phospholipase, e.g., a phospholipase A, B, C or D activity and the
sequence identities are determined by analysis with a sequence comparison algorithm or by a 
visual inspection. 

The invention provides isolated or recombinant nucleic acids comprising a
nucleic acid sequence having at least 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%,
87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or
complete (100%) sequence identity to SEQ ID NO:5 over a region of at least about 10, 15,
20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650,
700, 750, 800, 850 or more consecutive residues, wherein the nucleic acids encode at least one
polypeptide having a phospholipase, e.g., a phospholipase A, B, C or D activity and the
sequence identities are determined by analysis with a sequence comparison algorithm or by a
visual inspection.

The invention provides isolated or recombinant nucleic acids comprising a
nucleic acid sequence having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%,
59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%,
75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence
identity to SEQ ID NO:7 over a region of at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75,
100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850 or more
consecutive residues, wherein the nucleic acids encode at least one polypeptide having a
phospholipase, e.g., a phospholipase A, B, C or D activity and the sequence identities are
determined by analysis with a sequence comparison algorithm or by a visual inspection.

In alternative aspects, the isolated or recombinant nucleic acid encodes a
polypeptide comprising a sequence as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, or SEQ ID NO:8. In one aspect these polypeptides have a phospholipase, e.g., a
phospholipase A, B, C or D activity.

In one aspect, the sequence comparison algorithm is a BLAST algorithm, such
as a BLAST version 2.2.2 algorithm. In one aspect, the filtering setting is set to blastall -p
blastp -d "nr pataa" -F F and all other options are set to default.

In one aspect, the phospholipase activity comprises catalyzing hydrolysis of a
glycerolphosphate ester linkage (i.e., cleavage of glycerolphosphate ester linkages). The
phospholipase activity can comprise catalyzing hydrolysis of an ester linkage in a
phospholipid in a vegetable oil. The vegetable oil phospholipid can comprise an oilseed
phospholipid. The phospholipase activity can comprise a phospholipase C (PLC) activity, a
phospholipase A (PLA) activity, such as a phospholipase A1 or phospholipase A2 activity, a
phospholipase D (PLD) activity, such as a phospholipase D1 or a phospholipase D2 activity,
or patatin activity. The phospholipase activity can comprise hydrolysis of a glycoprotein,
e.g., as a glycoprotein found in a potato tuber. The phospholipase activity can comprise a
patatin enzymatic activity. The phospholipase activity can comprise a lipid acyl hydrolase
(LAH) activity.

In one aspect, the isolated or recombinant nucleic acid encodes a polypeptide
having a phospholipase activity which is thermostable. The polypeptide can retain a
phospholipase activity under conditions comprising a temperature range of between about
37°C to about 95°C; between about 55°C to about 85°C, between about 70°C to about 95°C,
or, between about 90°C to about 95°C. In another aspect, the isolated or recombinant nucleic
acid encodes a polypeptide having a phospholipase activity which is thermotolerant. The
polypeptide can retain a phospholipase activity after exposure to a temperature in the range
from greater than 37°C to about 95°C or anywhere in the range from greater than 55°C to
about 85°C. In one aspect, the polypeptide retains a phospholipase activity after exposure to
a temperature in the range from greater than 90°C to about 95°C at pH 4.5.

The polypeptide can retain a phospholipase activity under conditions
comprising about pH 7, pH 6.5, pH 6.0, pH 5.5, pH 5, or pH 4.5. The polypeptide can retain
a phospholipase activity under conditions comprising a temperature range of between about
40°C to about 70°C.

In one aspect, the isolated or recombinant nucleic acid comprises a sequence
that hybridizes under stringent conditions to a sequence as set forth in SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, or SEQ ID NO:7, wherein the nucleic acid encodes a polypeptide
having a phospholipase activity. The nucleic acid can be at least about 10, 20, 30, 40, 50, 60,
70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850 or
more residues in length or the full length of the gene or transcript, with or without a signal
sequence, as described herein. The stringent conditions can be highly stringent, moderately
stringent or of low stringency, as described herein. The stringent conditions can include a
wash step, e.g., a wash step comprising a wash in 0.2X SSC at a temperature of about 65°C
for about 15 minutes.

The invention provides a nucleic acid probe for identifying a nucleic acid
encoding a polypeptide with a phospholipase, e.g., a phospholipase, activity, wherein the
probe comprises at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400,
450, 500, 550, 600, 650, 700, 750, 800, 850, or more, consecutive bases of a sequence of the
invention, e.g., a sequence as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, or
SEQ ID NO:7, and the probe identifies the nucleic acid by binding or hybridization. The
probe can comprise an oligonucleotide comprising at least about 10 to 50, about 20 to 60,
about 30 to 70, about 40 to 80, or about 60 to 100 consecutive bases of a sequence as set forth
in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5 and/or SEQ ID NO:7.

The invention provides a nucleic acid probe for identifying a nucleic acid
encoding a polypeptide with a phospholipase, e.g., a phospholipase activity, wherein the
probe comprises a nucleic acid of the invention, e.g., a nucleic acid having at least 50%, 51%,
52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%,
68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%,
84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%,
or more, or complete (100%) sequence identity to SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5 and/or SEQ ID NO:7, or a subsequence thereof, over a region of at least about 10, 20,
30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700,
750, 800, 850 or more consecutive residues, wherein the sequence identities are determined
by analysis with a sequence comparison algorithm or by visual inspection.

The invention provides an amplification primer sequence pair for amplifying a
nucleic acid encoding a polypeptide having a phospholipase activity, wherein the primer pair
is capable of amplifying a nucleic acid comprising a sequence of the invention, or fragments
or subsequences thereof. One or each member of the amplification primer sequence pair can
comprise an oligonucleotide comprising at least about 10 to 50 consecutive bases of the
sequence, or about 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 consecutive bases
of the sequence.

The invention provides amplification primer pairs, wherein the primer pair 
comprises a first member having a sequence as set forth by about the first (the 5') 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 residues of a nucleic acid of the invention, and a
second member having a sequence as set forth by about the first (the 5') 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, or 25 residues of the complementary strand of the first member.
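
The primer pair described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the template is a plain DNA string written 5' to 3', the function names are hypothetical, and the example template is an invented placeholder rather than any sequence of the invention. It takes the first 12 to 25 residues of the given strand as one primer and the first 12 to 25 residues of the complementary strand (read 5' to 3', i.e. the reverse complement) as the other.

```python
from typing import Tuple

# Translation table for DNA complementation (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_primer_pair(template: str, length: int = 20) -> Tuple[str, str]:
    """Return (first member, second member) taken from the 5' ends of the
    template strand and of its complementary strand, respectively."""
    if not 12 <= length <= 25:
        raise ValueError("length must be between 12 and 25 residues")
    if len(template) < length:
        raise ValueError("template is shorter than the requested primer length")
    first = template[:length]
    second = reverse_complement(template)[:length]
    return first, second


if __name__ == "__main__":
    # Invented placeholder template, for illustration only.
    template = "ATGGCCAAGCTTGGATCCGAATTCTAGCTAGCTAA"
    first, second = terminal_primer_pair(template, 12)
    print("first member: ", first)
    print("second member:", second)
```

Practical primer design would also weigh melting temperature, GC content and secondary structure, none of which this sketch attempts.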

The invention provides phospholipases generated by amplification, e.g.,
polymerase chain reaction (PCR), using an amplification primer pair of the invention. The
invention provides methods of making a phospholipase by amplification, e.g., polymerase
chain reaction (PCR), using an amplification primer pair of the invention. In one aspect, the
amplification primer pair amplifies a nucleic acid from a library, e.g., a gene library, such as
an environmental library.

The invention provides methods of amplifying a nucleic acid encoding a
polypeptide having a phospholipase activity comprising amplification of a template nucleic
acid with an amplification primer sequence pair capable of amplifying a nucleic acid
sequence of the invention, or fragments or subsequences thereof. The amplification primer
pair can be an amplification primer pair of the invention.

The invention provides expression cassettes comprising a nucleic acid of the
invention or a subsequence thereof. In one aspect, the expression cassette can comprise the
nucleic acid that is operably linked to a promoter. The promoter can be a viral, bacterial,
mammalian or plant promoter. In one aspect, the plant promoter can be a potato, rice, corn,
wheat, tobacco or barley promoter. The promoter can be a constitutive promoter. The
constitutive promoter can comprise CaMV35S. In another aspect, the promoter can be an
inducible promoter. In one aspect, the promoter can be a tissue-specific promoter or an
environmentally regulated or a developmentally regulated promoter. Thus, the promoter can
be, e.g., a seed-specific, a leaf-specific, a root-specific, a stem-specific or an abscission-
induced promoter. In one aspect, the expression cassette can further comprise a plant or plant
virus expression vector.

The invention provides cloning vehicles comprising an expression cassette 
(e.g., a vector) of the invention or a nucleic acid of the invention. The cloning vehicle can be 
a viral vector, a plasmid, a phage, a phagemid, a cosmid, a fosmid, a bacteriophage or an 
artificial chromosome. The viral vector can comprise an adenovirus vector, a retroviral 
vector or an adeno-associated viral vector. The cloning vehicle can comprise a bacterial 
artificial chromosome (BAC), a plasmid, a bacteriophage P1-derived vector (PAC), a yeast
artificial chromosome (YAC), or a mammalian artificial chromosome (MAC).

The invention provides transformed cells comprising a nucleic acid of the
invention or an expression cassette (e.g., a vector) of the invention, or a cloning vehicle of the
invention. In one aspect, the transformed cell can be a bacterial cell, a mammalian cell, a
fungal cell, a yeast cell, an insect cell or a plant cell. In one aspect, the plant cell can be a
potato, wheat, rice, corn, tobacco or barley cell.

The invention provides transgenic non-human animals comprising a nucleic
acid of the invention or an expression cassette (e.g., a vector) of the invention. In one aspect,
the animal is a mouse.

The invention provides transgenic plants comprising a nucleic acid of the
invention or an expression cassette (e.g., a vector) of the invention. The transgenic plant can
be a corn plant, a potato plant, a tomato plant, a wheat plant, an oilseed plant, a rapeseed
plant, a soybean plant, a rice plant, a barley plant or a tobacco plant. The invention provides
transgenic seeds comprising a nucleic acid of the invention or an expression cassette (e.g., a
vector) of the invention. The transgenic seed can be a corn seed, a wheat kernel, an oilseed, a
rapeseed (a canola plant), a soybean seed, a palm kernel, a sunflower seed, a sesame seed, a
peanut or a tobacco plant seed.

09010-094001
WO 03/089620 PCT/US03/12556

The invention provides an antisense oligonucleotide comprising a nucleic acid
sequence complementary to or capable of hybridizing under stringent conditions to a nucleic
acid of the invention. The invention provides methods of inhibiting the translation of a
phospholipase message in a cell comprising administering to the cell or expressing in the cell
an antisense oligonucleotide comprising a nucleic acid sequence complementary to or
capable of hybridizing under stringent conditions to a nucleic acid of the invention. The
antisense oligonucleotide can be between about 10 to 50, about 20 to 60, about 30 to 70,
about 40 to 80, about 60 to 100, about 70 to 110, or about 80 to 120 bases in length.

The invention provides methods of inhibiting the translation of a
phospholipase, e.g., a phospholipase, message in a cell comprising administering to the cell
or expressing in the cell an antisense oligonucleotide comprising a nucleic acid sequence
complementary to or capable of hybridizing under stringent conditions to a nucleic acid of the
invention. The invention provides double-stranded inhibitory RNA (RNAi) molecules
comprising a subsequence of a sequence of the invention. In one aspect, the RNAi is about
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more duplex nucleotides in length. The invention
provides methods of inhibiting the expression of a phospholipase, e.g., a phospholipase, in a
cell comprising administering to the cell or expressing in the cell a double-stranded inhibitory
RNA (iRNA), wherein the RNA comprises a subsequence of a sequence of the invention.

The invention provides an isolated or recombinant polypeptide comprising an
amino acid sequence having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%,
58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%,
74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%)
sequence identity to an exemplary polypeptide or peptide of the invention over a region of at
least about 25, 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350 or more
residues, or over the full length of the polypeptide, and the sequence identities are determined
by analysis with a sequence comparison algorithm or by a visual inspection. Exemplary
polypeptide or peptide sequences of the invention include SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6 or SEQ ID NO:8. In one aspect, the invention provides an isolated or recombinant
polypeptide comprising an amino acid sequence having at least about 81%, 82%, 83%, 84%,
85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or
more, or complete (100%) sequence identity to SEQ ID NO:2. In one aspect, the invention
provides an isolated or recombinant polypeptide comprising an amino acid sequence having
at least about 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence
identity to SEQ ID NO:4. In one aspect, the invention provides an isolated or recombinant
polypeptide comprising an amino acid sequence having at least about 78%, 79%, 80%, 81%,
82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%,
98%, 99%, or more, or complete (100%) sequence identity to SEQ ID NO:6. In one aspect,
the invention provides an isolated or recombinant polypeptide comprising an amino acid
sequence having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%,
60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%,
76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity
to SEQ ID NO:8. The invention provides isolated or recombinant polypeptides encoded by a
nucleic acid of the invention. In alternative aspects, the polypeptide can have a sequence as
set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, or SEQ ID NO:8. The polypeptide
can have a phospholipase activity, e.g., a phospholipase A, B, C or D activity.

The invention provides isolated or recombinant polypeptides comprising a
polypeptide of the invention lacking a signal sequence. In one aspect, the polypeptide
lacking a signal sequence has at least 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to
residues 30 to 287 of SEQ ID NO:2, an amino acid sequence having at least 78%, 79%, 80%,
81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, or more sequence identity to residues 25 to 283 of SEQ ID NO:4, an amino
acid sequence having at least 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%,
89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more sequence identity to
residues 26 to 280 of SEQ ID NO:6, or, an amino acid sequence having at least 50%, 51%,
52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%,
68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%,
84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%,
or more sequence identity to residues 40 to 330 of SEQ ID NO:8. The sequence identities
can be determined by analysis with a sequence comparison algorithm or by visual inspection.

Another aspect of the invention provides an isolated or recombinant 
polypeptide or peptide including at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75,
80, 85, 90, 95 or 100 or more consecutive bases of a polypeptide or peptide sequence of the
invention, sequences substantially identical thereto, and the sequences complementary
thereto. The peptide can be, e.g., an immunogenic fragment, a motif (e.g., a binding site) or
an active site. 

In one aspect, the isolated or recombinant polypeptide of the invention (with 
or without a signal sequence) has a phospholipase activity. In one aspect, the phospholipase 
activity comprises catalyzing hydrolysis of a glycerolphosphate ester linkage (i.e., cleavage 
of glycerolphosphate ester linkages). The phospholipase activity can comprise catalyzing 
hydrolysis of an ester linkage in a phospholipid in a vegetable oil. The vegetable oil 
phospholipid can comprise an oilseed phospholipid. The phospholipase activity can comprise
a phospholipase C (PLC) activity, a phospholipase A (PLA) activity, such as a phospholipase
A1 or phospholipase A2 activity, a phospholipase D (PLD) activity, such as a phospholipase
D1 or a phospholipase D2 activity. The phospholipase activity can comprise hydrolysis of a
glycoprotein, e.g., as a glycoprotein found in a potato tuber. The phospholipase activity can
comprise a patatin enzymatic activity. The phospholipase activity can comprise a lipid acyl
hydrolase (LAH) activity.

In one aspect, the phospholipase activity is thermostable. The polypeptide can
retain a phospholipase activity under conditions comprising a temperature range of between
about 37°C to about 95°C, between about 55°C to about 85°C, between about 70°C to about
95°C, or between about 90°C to about 95°C. In another aspect, the phospholipase activity can
be thermotolerant. The polypeptide can retain a phospholipase activity after exposure to a
temperature in the range from greater than 37°C to about 95°C, or in the range from greater
than 55°C to about 85°C. In one aspect, the polypeptide can retain a phospholipase activity
after exposure to a temperature in the range from greater than 90°C to about 95°C at pH 4.5.

In one aspect, the polypeptide can retain a phospholipase activity under
conditions comprising about pH 6.5, pH 6, pH 5.5, pH 5, pH 4.5 or pH 4. In another aspect,
the polypeptide can retain a phospholipase activity under conditions comprising about pH 7,
pH 7.5, pH 8.0, pH 8.5, pH 9, pH 9.5, pH 10, pH 10.5 or pH 11.

In one aspect, the isolated or recombinant polypeptide can comprise the 
polypeptide of the invention that lacks a signal sequence. In one aspect, the isolated or 
recombinant polypeptide can comprise the polypeptide of the invention comprising a 
heterologous signal sequence, such as a heterologous phospholipase or non-phospholipase
signal sequence. 

The invention provides isolated or recombinant peptides comprising an amino
acid sequence having at least 95%, 96%, 97%, 98%, 99%, or more sequence identity to
residues 1 to 29 of SEQ ID NO:2, at least 95%, 96%, 97%, 98%, 99%, or more sequence
identity to residues 1 to 24 of SEQ ID NO:4, at least 95%, 96%, 97%, 98%, 99%, or more
sequence identity to residues 1 to 25 of SEQ ID NO:6, or at least 95%, 96%, 97%, 98%, 99%,
or more sequence identity to residues 1 to 39 of SEQ ID NO:8, and to other signal sequences
as set forth in the SEQ ID listing, wherein the sequence identities are determined by analysis
with a sequence comparison algorithm or by visual inspection. These peptides can act as
signal sequences on their endogenous phospholipase, on another phospholipase, or a
heterologous protein (a non-phospholipase enzyme or other protein). In one aspect, the
invention provides chimeric proteins comprising a first domain comprising a signal sequence
of the invention and at least a second domain. The protein can be a fusion protein. The
second domain can comprise an enzyme. The enzyme can be a phospholipase.

The invention provides chimeric polypeptides comprising at least a first 
domain comprising a signal peptide (SP) of the invention or a catalytic domain (CD), or active 
site, of a phospholipase of the invention and at least a second domain comprising a 
heterologous polypeptide or peptide, wherein the heterologous polypeptide or peptide is not 
naturally associated with the signal peptide (SP) or catalytic domain (CD). In one aspect, the 
heterologous polypeptide or peptide is not a phospholipase. The heterologous polypeptide or 
peptide can be amino terminal to, carboxy terminal to or on both ends of the signal peptide 
(SP) or catalytic domain (CD). 

The invention provides isolated or recombinant nucleic acids encoding a 
chimeric polypeptide, wherein the chimeric polypeptide comprises at least a first domain 
comprising a signal peptide (SP) or a catalytic domain (CD), or active site, of a polypeptide of 
the invention, and at least a second domain comprising a heterologous polypeptide or peptide, 
wherein the heterologous polypeptide or peptide is not naturally associated with the signal 
peptide (SP) or catalytic domain (CD). 

In one aspect, the phospholipase activity comprises a specific activity at about 
37°C in the range from about 100 to about 1000 units per milligram of protein. In another 
aspect, the phospholipase activity comprises a specific activity from about 500 to about 750 
units per milligram of protein. Alternatively, the phospholipase activity comprises a specific 
activity at 37°C in the range from about 500 to about 1200 units per milligram of protein. In 
one aspect, the phospholipase activity comprises a specific activity at 37°C in the range from 
about 750 to about 1000 units per milligram of protein. In another aspect, the 
thermotolerance comprises retention of at least half of the specific activity of the 
phospholipase at 37°C after being heated to the elevated temperature. Alternatively, the 
thermotolerance can comprise retention of specific activity at 37°C in the range from about 
500 to about 1200 units per milligram of protein after being heated to the elevated 
temperature. 
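
As an illustration only, the thermotolerance criteria recited above can be expressed as simple numeric checks. The following Python sketch assumes that specific activity has already been measured, in units per milligram of protein at 37°C, before and after heating. The function names, the treatment of the "about" bounds as hard limits, and the sample values are hypothetical and are not taken from the disclosure.

```python
def retains_thermotolerance(activity_before, activity_after, min_fraction=0.5):
    """Return True if the activity measured after heating is at least
    min_fraction of the activity measured before heating.

    Both activities are specific activities (units/mg protein) at 37 C.
    min_fraction=0.5 corresponds to "at least half" of the specific activity.
    """
    if activity_before <= 0:
        raise ValueError("baseline specific activity must be positive")
    return activity_after / activity_before >= min_fraction


def in_specific_activity_range(activity, low=500.0, high=1200.0):
    """Return True if a specific activity lies within [low, high] units/mg.

    The defaults mirror the 'about 500 to about 1200' range; treating the
    bounds as exact is a simplifying assumption of this sketch.
    """
    return low <= activity <= high


# Hypothetical measurements: 800 units/mg before heating, 450 units/mg after.
print(retains_thermotolerance(800.0, 450.0))
print(in_specific_activity_range(450.0))
```

In practice the "about" language would call for a tolerance around each bound; the sketch leaves that choice to the user.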

The invention provides the isolated or recombinant polypeptide of the 
invention, wherein the polypeptide comprises at least one glycosylation site. In one aspect, 
glycosylation can be an N-linked glycosylation. In one aspect, the polypeptide can be 
glycosylated after being expressed in a P. pastoris or a S. pombe. 
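
One conventional way to locate candidate N-linked glycosylation sites in a polypeptide sequence is to scan for the consensus sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below assumes that convention. The function name and the example sequence are hypothetical, and a matched sequon indicates only a potential site, not that glycosylation actually occurs in a given host.

```python
import re

# Zero-width lookahead so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_linked_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide: N2 (NGS) and N9 (NAT) qualify; N6 (NPT) does not.
print(find_n_linked_sequons("MNGSANPTNAT"))
```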

The invention provides protein preparations comprising a polypeptide of the 
invention, wherein the protein preparation comprises a liquid, a solid or a gel. 

The invention provides heterodimers comprising a polypeptide of the 
invention and a second protein or domain. The second member of the heterodimer can be a 
different phospholipase, a different enzyme or another protein. In one aspect, the second 
domain can be a polypeptide and the heterodimer can be a fusion protein. In one aspect, the 
second domain can be an epitope or a tag. In one aspect, the invention provides homodimers 
comprising a polypeptide of the invention. 

The invention provides immobilized polypeptides having a phospholipase 
activity, wherein the polypeptide comprises a polypeptide of the invention, a polypeptide 
encoded by a nucleic acid of the invention, or a polypeptide comprising a polypeptide of the 
invention and a second domain. In one aspect, the polypeptide can be immobilized on a cell, 
a metal, a resin, a polymer, a ceramic, a glass, a microelectrode, a graphitic particle, a bead, a 
gel, a plate, an array or a capillary tube. 

The invention provides arrays comprising an immobilized polypeptide, 
wherein the polypeptide is a phospholipase of the invention or is a polypeptide encoded by a 
nucleic acid of the invention. The invention provides arrays comprising an immobilized 
nucleic acid of the invention. The invention provides an array comprising an immobilized 
antibody of the invention. 

The invention provides isolated or recombinant antibodies that specifically 
bind to a polypeptide of the invention or to a polypeptide encoded by a nucleic acid of the 
invention. The antibody can be a monoclonal or a polyclonal antibody. The invention 
provides hybridomas comprising an antibody of the invention. 

The invention provides methods of isolating or identifying a polypeptide with 
a phospholipase activity comprising the steps of: (a) providing an antibody of the invention; 
(b) providing a sample comprising polypeptides; and, (c) contacting the sample of step (b) 
with the antibody of step (a) under conditions wherein the antibody can specifically bind to 
the polypeptide, thereby isolating or identifying a phospholipase. The invention provides 
methods of making an anti-phospholipase antibody comprising administering to a non-human 
animal a nucleic acid of the invention, or a polypeptide of the invention, in an amount 
sufficient to generate a humoral immune response, thereby making an anti-phospholipase 
antibody. 

The invention provides methods of producing a recombinant polypeptide 
comprising the steps of: (a) providing a nucleic acid of the invention operably linked to a 
promoter; and, (b) expressing the nucleic acid of step (a) under conditions that allow 
expression of the polypeptide, thereby producing a recombinant polypeptide. The nucleic 
acid can comprise a sequence having at least 85% sequence identity to SEQ ID NO:1 over a 
region of at least about 100 residues, having at least 80% sequence identity to SEQ ID NO:3 
over a region of at least about 100 residues, having at least 80% sequence identity to SEQ ID 
NO:5 over a region of at least about 100 residues, or having at least 70% sequence identity to 
SEQ ID NO:7 over a region of at least about 100 residues, wherein the sequence identities are 
determined by analysis with a sequence comparison algorithm or by visual inspection. The 
nucleic acid can comprise a nucleic acid that hybridizes under stringent conditions to a 
nucleic acid as set forth in SEQ ID NO:1, or a subsequence thereof; a sequence as set forth in 
SEQ ID NO:3, or a subsequence thereof; a sequence as set forth in SEQ ID NO:5, or a 
subsequence thereof; or, a sequence as set forth in SEQ ID NO:7, or a subsequence thereof. 
The method can further comprise transforming a host cell with the nucleic acid of step (a) 
followed by expressing the nucleic acid of step (a), thereby producing a recombinant 
polypeptide in a transformed cell. The method can further comprise inserting into a host non- 
human animal the nucleic acid of step (a) followed by expressing the nucleic acid of step (a), 
thereby producing a recombinant polypeptide in the host non-human animal. 
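
The recited criterion of a minimum percent sequence identity over a region of at least about 100 residues can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned by some sequence comparison algorithm, so that they are of equal length with "-" marking gaps. The function names and the definition of identity used here (matching columns divided by aligned columns) are assumptions of this sketch; other tools may define percent identity differently.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch. Returns 0.0 if no columns remain.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def has_identity_over_region(aligned_a, aligned_b, threshold, min_region=100):
    """True if some window of min_region alignment columns reaches threshold."""
    n = len(aligned_a)
    if n != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    for start in range(n - min_region + 1):
        window_a = aligned_a[start:start + min_region]
        window_b = aligned_b[start:start + min_region]
        if percent_identity(window_a, window_b) >= threshold:
            return True
    return False


# Hypothetical example: a 100-base reference and a copy with 15 substitutions.
reference = "ACGT" * 25
variant = "".join(
    ("C" if base == "A" else "A") if i < 15 else base
    for i, base in enumerate(reference)
)
print(percent_identity(reference, variant))
print(has_identity_over_region(reference, variant, threshold=85.0))
```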

The invention provides methods for identifying a polypeptide having a 
phospholipase activity comprising the following steps: (a) providing a polypeptide of the 
invention or a polypeptide encoded by a nucleic acid of the invention, or a fragment or 
variant thereof; (b) providing a phospholipase substrate; and, (c) contacting the polypeptide 
or a fragment or variant thereof of step (a) with the substrate of step (b) and detecting an 
increase in the amount of substrate or a decrease in the amount of reaction product, wherein a 
decrease in the amount of the substrate or an increase in the amount of the reaction product 
detects a polypeptide having a phospholipase activity. In alternative aspects, the nucleic acid 
comprises a sequence having at least 85% sequence identity to SEQ ID NO:1 over a region of 
at least about 100 residues, having at least 80% sequence identity to SEQ ID NO:3 over a 
region of at least about 100 residues, having at least 80% sequence identity to SEQ ID NO:5 
over a region of at least about 100 residues, or having at least 70% sequence identity to SEQ 
ID NO:7 over a region of at least about 100 residues, wherein the sequence identities are 
determined by analysis with a sequence comparison algorithm or by visual inspection. In 
alternative aspects the nucleic acid hybridizes under stringent conditions to a sequence as set 
forth in SEQ ID NO:1, or a subsequence thereof; a sequence as set forth in SEQ ID NO:3, or 
a subsequence thereof; a sequence as set forth in SEQ ID NO:5, or a subsequence thereof; or, 
a sequence as set forth in SEQ ID NO:7, or a subsequence thereof. 

The invention provides methods for identifying a phospholipase substrate 
comprising the following steps: (a) providing a polypeptide of the invention or a 
polypeptide encoded by a nucleic acid of the invention; (b) providing a test substrate; and, 
(c) contacting the polypeptide of step (a) with the test substrate of step (b) and detecting an 
increase in the amount of substrate, or a decrease in the amount of reaction product, wherein a 
decrease in the amount of the substrate or an increase in the amount of the reaction product 
identifies the test substrate as a phospholipase substrate. In alternative aspects, the nucleic 
acid can have at least 85% sequence identity to SEQ ID NO:1 over a region of at least about 
100 residues, at least 80% sequence identity to SEQ ID NO:3 over a region of at least about 
100 residues, at least 80% sequence identity to SEQ ID NO:5 over a region of at least about 
100 residues, or, at least 70% sequence identity to SEQ ID NO:7 over a region of at least 
about 100 residues, wherein the sequence identities are determined by analysis with a 
sequence comparison algorithm or by visual inspection. In alternative aspects, the nucleic 
acid hybridizes under stringent conditions to a sequence as set forth in SEQ ID NO:1, or a 
subsequence thereof; a sequence as set forth in SEQ ID NO:3, or a subsequence thereof; a 
sequence as set forth in SEQ ID NO:5, or a subsequence thereof; or, a sequence as set forth in 
SEQ ID NO:7, or a subsequence thereof. 

The invention provides methods of determining whether a compound 
specifically binds to a phospholipase comprising the following steps: (a) expressing a 
nucleic acid or a vector comprising the nucleic acid under conditions permissive for 
translation of the nucleic acid to a polypeptide, wherein the nucleic acid and vector comprise 
a nucleic acid or vector of the invention; or, providing a polypeptide of the invention; (b) 
contacting the polypeptide with the test compound; and, (c) determining whether the test 
compound specifically binds to the polypeptide, thereby determining that the compound 
specifically binds to the phospholipase. In alternative aspects, the nucleic acid sequence has 
at least 85% sequence identity to SEQ ID NO:1 over a region of at least about 100 residues, 
at least 80% sequence identity to SEQ ID NO:3 over a region of at least about 100 residues, 
at least 80% sequence identity to SEQ ID NO:5 over a region of at least about 100 residues, or, 
at least 70% sequence identity to SEQ ID NO:7 over a region of at least about 100 residues, 
wherein the sequence identities are determined by analysis with a sequence comparison 
algorithm or by visual inspection. In alternative aspects, the nucleic acid hybridizes under 
stringent conditions to a sequence as set forth in SEQ ID NO:1, or a subsequence thereof; a 
sequence as set forth in SEQ ID NO:3, or a subsequence thereof; a sequence as set forth in 
SEQ ID NO:5, or a subsequence thereof; or, a sequence as set forth in SEQ ID NO:7, or a 
subsequence thereof. 

The invention provides methods for identifying a modulator of a 
phospholipase activity comprising the following steps: (a) providing a polypeptide of the 
invention or a polypeptide encoded by a nucleic acid of the invention; (b) providing a test 
compound; (c) contacting the polypeptide of step (a) with the test compound of step (b); 
and, measuring an activity of the phospholipase, wherein a change in the phospholipase 
activity measured in the presence of the test compound compared to the activity in the 
absence of the test compound provides a determination that the test compound modulates the 
phospholipase activity. In alternative aspects, the nucleic acid can have at least 85% 
sequence identity to SEQ ID NO:1 over a region of at least about 100 residues, at least 80% 
sequence identity to SEQ ID NO:3 over a region of at least about 100 residues, at least 80% 
sequence identity to SEQ ID NO:5 over a region of at least about 100 residues, or, at least 
70% sequence identity to SEQ ID NO:7 over a region of at least about 100 residues, wherein 
the sequence identities are determined by analysis with a sequence comparison algorithm or 
by visual inspection. In alternative aspects, the nucleic acid can hybridize under stringent 
conditions to a nucleic acid sequence selected from the group consisting of a sequence as set 
forth in SEQ ID NO:1, or a subsequence thereof; a sequence as set forth in SEQ ID NO:3, or 
a subsequence thereof; a sequence as set forth in SEQ ID NO:5, or a subsequence thereof; 
and, a sequence as set forth in SEQ ID NO:7, or a subsequence thereof. 

In one aspect, the phospholipase activity is measured by providing a 
phospholipase substrate and detecting an increase in the amount of the substrate or a decrease 
in the amount of a reaction product. The decrease in the amount of the substrate or the 
increase in the amount of the reaction product with the test compound as compared to the 
amount of substrate or reaction product without the test compound identifies the test 
compound as an activator of phospholipase activity. The increase in the amount of the 
substrate or the decrease in the amount of the reaction product with the test compound as 
compared to the amount of substrate or reaction product without the test compound identifies 
the test compound as an inhibitor of phospholipase activity. 
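
The activator and inhibitor determinations described above amount to comparing reaction product formed with and without the test compound. The following Python sketch shows that comparison under stated assumptions: the amount of reaction product is used as the readout, the two measurements are taken under otherwise identical conditions, and a relative change smaller than a hypothetical tolerance is treated as no effect. The function name, the tolerance value and the sample numbers are illustrative only.

```python
def classify_modulator(product_without, product_with, tolerance=0.05):
    """Classify a test compound from the amount of reaction product formed.

    product_without: product formed in the absence of the test compound.
    product_with:    product formed in the presence of the test compound.
    tolerance:       hypothetical relative change below which no effect is called.
    """
    if product_without <= 0:
        raise ValueError("control measurement must be positive")
    relative_change = (product_with - product_without) / product_without
    if relative_change > tolerance:
        return "activator"   # more product formed with the compound
    if relative_change < -tolerance:
        return "inhibitor"   # less product formed with the compound
    return "no effect"


# Hypothetical readouts in arbitrary units.
print(classify_modulator(100.0, 150.0))
print(classify_modulator(100.0, 40.0))
```

If remaining substrate is measured instead of product, the signs reverse: less remaining substrate with the compound indicates activation.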

The invention provides computer systems comprising a processor and a data 
storage device wherein said data storage device has stored thereon a polypeptide sequence of 
the invention or a nucleic acid sequence of the invention. 

In one aspect, the computer system can further comprise a sequence 
comparison algorithm and a data storage device having at least one reference sequence stored 
thereon. The sequence comparison algorithm can comprise a computer program that 
indicates polymorphisms. The computer system can further comprise an identifier that 
identifies one or more features in said sequence. 

The invention provides computer readable mediums having stored thereon a 
sequence comprising a polypeptide sequence of the invention or a nucleic acid sequence of 
the invention. 

The invention provides methods for identifying a feature in a sequence 
comprising the steps of: (a) reading the sequence using a computer program which identifies 
one or more features in a sequence, wherein the sequence comprises a polypeptide sequence 
of the invention or a nucleic acid sequence of the invention; and, (b) identifying one or more 
features in the sequence with the computer program. 

The invention provides methods for comparing a first sequence to a second 
sequence comprising the steps of: (a) reading the first sequence and the second sequence 
through use of a computer program which compares sequences, wherein the first sequence 
comprises a polypeptide sequence of the invention or a nucleic acid sequence of the 
invention; and, (b) determining differences between the first sequence and the second 
sequence with the computer program. In one aspect, the step of determining differences 
between the first sequence and the second sequence further comprises the step of identifying 
polymorphisms. In one aspect, the method further comprises an identifier (and use of the 
identifier) that identifies one or more features in a sequence. In one aspect, the method 
comprises reading the first sequence using a computer program and identifying one or more 
features in the sequence. 
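
A minimal Python sketch of the comparison step is given below. It assumes the first and second sequences are already aligned and of equal length, and it reports each position at which they differ as a candidate polymorphism. The function name and the example sequences are hypothetical; a practical program would first align the sequences with a sequence comparison algorithm.

```python
from typing import List, Tuple


def find_differences(first: str, second: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, residue in first, residue in second)
    for every position where two equal-length sequences differ."""
    if len(first) != len(second):
        raise ValueError("sequences must be aligned to equal length")
    return [
        (index + 1, a, b)
        for index, (a, b) in enumerate(zip(first.upper(), second.upper()))
        if a != b
    ]


# Hypothetical nucleotide sequences differing at positions 3 and 6.
for position, ref_base, alt_base in find_differences("ACGTAC", "ACCTAA"):
    print(f"position {position}: {ref_base} -> {alt_base}")
```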

The invention provides methods for isolating or recovering a nucleic acid 
encoding a polypeptide with a phospholipase activity from an environmental sample 
comprising the steps of: (a) providing an amplification primer sequence pair for amplifying 
a nucleic acid encoding a polypeptide with a phospholipase activity, wherein the primer pair 
is capable of amplifying a nucleic acid of the invention (e.g., SEQ ID NO:1, or a subsequence 
thereof; SEQ ID NO:3, or a subsequence thereof; SEQ ID NO:5, or a subsequence thereof; or 
SEQ ID NO:7, or a subsequence thereof, etc.); (b) isolating a nucleic acid from the 
environmental sample or treating the environmental sample such that nucleic acid in the 
sample is accessible for hybridization to the amplification primer pair; and, (c) combining 
the nucleic acid of step (b) with the amplification primer pair of step (a) and amplifying 
nucleic acid from the environmental sample, thereby isolating or recovering a nucleic acid 
encoding a polypeptide with a phospholipase activity from an environmental sample. In one 
aspect, each member of the amplification primer sequence pair comprises an oligonucleotide 
comprising at least about 10 to 50 consecutive bases of a nucleic acid sequence of the 
invention. In one aspect, the amplification primer sequence pair is an amplification pair of 
the invention. 

The invention provides methods for isolating or recovering a nucleic acid 
encoding a polypeptide with a phospholipase activity from an environmental sample 
comprising the steps of: (a) providing a polynucleotide probe comprising a nucleic acid 
sequence of the invention, or a subsequence thereof; (b) isolating a nucleic acid from the 
environmental sample or treating the environmental sample such that nucleic acid in the 
sample is accessible for hybridization to a polynucleotide probe of step (a); (c) combining 
the isolated nucleic acid or the treated environmental sample of step (b) with the 
polynucleotide probe of step (a); and, (d) isolating a nucleic acid that specifically hybridizes 
with the polynucleotide probe of step (a), thereby isolating or recovering a nucleic acid 
encoding a polypeptide with a phospholipase activity from the environmental sample. In 
alternative aspects, the environmental sample comprises a water sample, a liquid sample, a 
soil sample, an air sample or a biological sample. In alternative aspects, the biological 
sample is derived from a bacterial cell, a protozoan cell, an insect cell, a yeast cell, a plant 
cell, a fungal cell or a mammalian cell. 

The invention provides methods of generating a variant of a nucleic acid 
encoding a phospholipase comprising the steps of: (a) providing a template nucleic acid 
comprising a nucleic acid of the invention; and, (b) modifying, deleting or adding one or more 
nucleotides in the template sequence, or a combination thereof, to generate a variant of the 
template nucleic acid. 

In one aspect, the method further comprises expressing the variant nucleic acid 
to generate a variant phospholipase polypeptide. In alternative aspects, the modifications, 
additions or deletions are introduced by error-prone PCR, shuffling, oligonucleotide-directed 
mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette 
mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis, site- 
specific mutagenesis, gene reassembly, gene site saturated mutagenesis (GSSM), synthetic 
ligation reassembly (SLR) and/or a combination thereof. In alternative aspects, the 
modifications, additions or deletions are introduced by a method selected from the group 
consisting of recombination, recursive sequence recombination, phosphothioate-modified 
DNA mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis, 
point mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical 
mutagenesis, radiogenic mutagenesis, deletion mutagenesis, restriction-selection 
mutagenesis, restriction-purification mutagenesis, artificial gene synthesis, ensemble 
mutagenesis, chimeric nucleic acid multimer creation and/or a combination thereof. 

In one aspect, the method is iteratively repeated until a phospholipase having 
an altered or different activity or an altered or different stability from that of a phospholipase 
encoded by the template nucleic acid is produced. In one aspect, the altered or different 
activity is a phospholipase activity under an acidic condition, wherein the phospholipase 
encoded by the template nucleic acid is not active under the acidic condition. In one aspect, 
the altered or different activity is a phospholipase activity under a high temperature, wherein 
the phospholipase encoded by the template nucleic acid is not active under the high 
temperature. In one aspect, the method is iteratively repeated until a phospholipase coding 
sequence having an altered codon usage from that of the template nucleic acid is produced. 
The method can be iteratively repeated until a phospholipase gene having higher or lower 
level of message expression or stability from that of the template nucleic acid is produced. 

The invention provides methods for modifying codons in a nucleic acid 
encoding a phospholipase to increase its expression in a host cell, the method comprising (a) 
providing a nucleic acid of the invention encoding a phospholipase; and, (b) identifying a 
non-preferred or a less preferred codon in the nucleic acid of step (a) and replacing it with a 
preferred or neutrally used codon encoding the same amino acid as the replaced codon, 
wherein a preferred codon is a codon over-represented in coding sequences in genes in the 
host cell and a non-preferred or less preferred codon is a codon under-represented in coding 
sequences in genes in the host cell, thereby modifying the nucleic acid to increase its 
expression in a host cell. 

The invention provides methods for modifying codons in a nucleic acid 
encoding a phospholipase, the method comprising (a) providing a nucleic acid of the 
invention encoding a phospholipase; and, (b) identifying a codon in the nucleic acid of step 
(a) and replacing it with a different codon encoding the same amino acid as the replaced 
codon, thereby modifying codons in a nucleic acid encoding a phospholipase. 

The invention provides methods for modifying codons in a nucleic acid 
encoding a phospholipase to increase its expression in a host cell, the method comprising (a) 
providing a nucleic acid of the invention encoding a phospholipase; and, (b) identifying a 
non-preferred or a less preferred codon in the nucleic acid of step (a) and replacing it with a 
preferred or neutrally used codon encoding the same amino acid as the replaced codon, 
wherein a preferred codon is a codon over-represented in coding sequences in genes in the 
host cell and a non-preferred or less preferred codon is a codon under-represented in coding 
sequences in genes in the host cell, thereby modifying the nucleic acid to increase its 
expression in a host cell. 

The invention provides methods for modifying a codon in a nucleic acid 
encoding a phospholipase to decrease its expression in a host cell, the method comprising (a) 
providing a nucleic acid of the invention encoding a phospholipase; and, (b) identifying at 
least one preferred codon in the nucleic acid of step (a) and replacing it with a non-preferred 
or less preferred codon encoding the same amino acid as the replaced codon, wherein a 
preferred codon is a codon over-represented in coding sequences in genes in a host cell and a 
non-preferred or less preferred codon is a codon under-represented in coding sequences in 
genes in the host cell, thereby modifying the nucleic acid to decrease its expression in a host 
cell. In alternative aspects, the host cell is a bacterial cell, a fungal cell, an insect cell, a yeast 
cell, a plant cell or a mammalian cell. 
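
The codon replacement steps described above can be sketched in Python as follows. The sketch uses the standard genetic code and a small, entirely hypothetical codon usage table for an unspecified host; real usage frequencies would have to be obtained for the actual host cell. The function names and the example coding sequence are likewise illustrative, and codons for which the table has no synonymous entry are left unchanged.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical relative usage frequencies in an assumed host (not real data).
HOST_USAGE = {"CTG": 0.50, "CTA": 0.04, "AAA": 0.74, "AAG": 0.26}


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, usage, increase=True):
    """Replace each codon with a synonymous codon from the usage table.

    increase=True picks the most used synonym (preferred codon);
    increase=False picks the least used synonym (non-preferred codon).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    pick = max if increase else min
    recoded = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        synonyms = [c for c in usage if CODON_TABLE.get(c) == amino_acid]
        recoded.append(pick(synonyms, key=usage.get) if synonyms else codon)
    return "".join(recoded)


original = "ATGCTAAAGTAA"  # hypothetical: Met-Leu-Lys-stop
optimized = recode(original, HOST_USAGE, increase=True)
print(optimized, translate(optimized) == translate(original))
```

Because only synonymous codons are substituted, the encoded amino acid sequence is unchanged in either direction.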

The invention provides methods for producing a library of nucleic acids 
encoding a plurality of modified phospholipase active sites or substrate binding sites, wherein 
the modified active sites or substrate binding sites are derived from a first nucleic acid 
comprising a sequence encoding a first active site or a first substrate binding site, the method 
comprising: (a) providing a first nucleic acid encoding a first active site or first substrate 
binding site, wherein the first nucleic acid sequence comprises a nucleic acid of the 
invention; (b) providing a set of mutagenic oligonucleotides that encode naturally-occurring 
amino acid variants at a plurality of targeted codons in the first nucleic acid; and, (c) using 
the set of mutagenic oligonucleotides to generate a set of active site-encoding or substrate 
binding site-encoding variant nucleic acids encoding a range of amino acid variations at each 
amino acid codon that was mutagenized, thereby producing a library of nucleic acids 
encoding a plurality of modified phospholipase active sites or substrate binding sites. In 
alternative aspects, the method comprises mutagenizing the first nucleic acid of step (a) by a 
method comprising an optimized directed evolution system, gene site-saturation mutagenesis 
(GSSM), and synthetic ligation reassembly (SLR). The method can further comprise 
mutagenizing the first nucleic acid of step (a) or variants by a method comprising error-prone 
PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR 
mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, 
exponential ensemble mutagenesis, site-specific mutagenesis, gene reassembly, gene site 
saturated mutagenesis (GSSM), synthetic ligation reassembly (SLR) and a combination 
thereof. The method can further comprise mutagenizing the first nucleic acid of step (a) or 
variants by a method comprising recombination, recursive sequence recombination, 
phosphothioate-modified DNA mutagenesis, uracil-containing template mutagenesis, gapped 
duplex mutagenesis, point mismatch repair mutagenesis, repair-deficient host strain 
mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion mutagenesis, 
restriction-selection mutagenesis, restriction-purification mutagenesis, artificial gene 
synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation and a combination 
thereof. 
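
The idea of encoding a range of amino acid variations at each targeted codon can be illustrated by enumerating, in Python, every single-residue substitution at a set of targeted positions. For brevity the sketch works at the protein level rather than on the mutagenic oligonucleotides themselves, and it is not a description of the GSSM or SLR procedures. The function name, the variant naming scheme and the example sequence are hypothetical.

```python
NATURAL_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_variants(protein, positions):
    """Enumerate all single-substitution variants at the given 1-based positions.

    Returns a dict mapping a label such as 'K2A' (original residue, position,
    new residue) to the full variant sequence.
    """
    protein = protein.upper()
    variants = {}
    for position in positions:
        if not 1 <= position <= len(protein):
            raise ValueError(f"position {position} is outside the sequence")
        original = protein[position - 1]
        for amino_acid in NATURAL_AMINO_ACIDS:
            if amino_acid != original:
                label = f"{original}{position}{amino_acid}"
                variants[label] = (
                    protein[:position - 1] + amino_acid + protein[position:]
                )
    return variants


# Hypothetical tripeptide with position 2 targeted: 19 variants result.
library = saturation_variants("MKT", [2])
print(len(library), library["K2A"])
```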

The invention provides methods for making a small molecule comprising the 
steps of: (a) providing a plurality of biosynthetic enzymes capable of synthesizing or 
modifying a small molecule, wherein one of the enzymes comprises a phospholipase enzyme 
encoded by a nucleic acid of the invention; (b) providing a substrate for at least one of the 
enzymes of step (a); and, (c) reacting the substrate of step (b) with the enzymes under 
conditions that facilitate a plurality of biocatalytic reactions to generate a small molecule by a 
series of biocatalytic reactions. 

The invention provides methods for modifying a small molecule comprising 
the steps: (a) providing a phospholipase enzyme encoded by a nucleic acid of the invention; 
(b) providing a small molecule; and, (c) reacting the enzyme of step (a) with the small 
molecule of step (b) under conditions that facilitate an enzymatic reaction catalyzed by the 
phospholipase enzyme, thereby modifying a small molecule by a phospholipase enzymatic 
reaction. In one aspect, the method comprises providing a plurality of small molecule 
substrates for the enzyme of step (a), thereby generating a library of modified small 
molecules produced by at least one enzymatic reaction catalyzed by the phospholipase 
enzyme. In one aspect, the method further comprises a plurality of additional enzymes under 
conditions that facilitate a plurality of biocatalytic reactions by the enzymes to form a library 
of modified small molecules produced by the plurality of enzymatic reactions. In one aspect, 
the method further comprises the step of testing the library to determine if a particular 
modified small molecule that exhibits a desired activity is present within the library. The step 
of testing the library can further comprise the steps of systematically eliminating all but one 
of the biocatalytic reactions used to produce a portion of the plurality of the modified small 
molecules within the library by testing the portion of the modified small molecule for the 
presence or absence of the particular modified small molecule with a desired activity, and 
identifying at least one specific biocatalytic reaction that produces the particular modified 
small molecule of desired activity. 

The invention provides methods for determining a functional fragment of a 
phospholipase enzyme comprising the steps of: (a) providing a phospholipase enzyme 
comprising an amino acid sequence of the invention; and, (b) deleting a plurality of amino 
acid residues from the sequence of step (a) and testing the remaining subsequence for a 
phospholipase activity, thereby determining a functional fragment of a phospholipase 
enzyme. In one aspect, the phospholipase activity is measured by providing a phospholipase 
substrate and detecting an increase in the amount of the substrate or a decrease in the amount 
of a reaction product. In one aspect, a decrease in the amount of an enzyme substrate or an 
increase in the amount of the reaction product with the test compound as compared to the 
amount of substrate or reaction product without the test compound identifies the test 
compound as an activator of phospholipase activity. 

The invention provides methods for cleaving a glycerolphosphate ester linkage 
comprising the following steps: (a) providing a polypeptide having a phospholipase activity, 
wherein the polypeptide comprises an amino acid sequence of the invention, or the 
polypeptide is encoded by a nucleic acid of the invention; (b) providing a composition 
comprising a glycerolphosphate ester linkage; and, (c) contacting the polypeptide of step (a) 
with the composition of step (b) under conditions wherein the polypeptide cleaves the 
glycerolphosphate ester linkage. In one aspect, the conditions comprise between about pH 5 
to about 5.5, or, between about pH 4.5 to about 5.0. In one aspect, the conditions comprise a 
temperature of between about 40°C and about 70°C. In one aspect, the composition 
comprises a vegetable oil. In one aspect, the composition comprises an oilseed phospholipid. 
In one aspect, the cleavage reaction can generate a water extractable phosphorylated base 
and a diglyceride. 

The invention provides methods for oil degumming comprising the following 
steps: (a) providing a polypeptide having a phospholipase activity, wherein the polypeptide 
comprises an amino acid sequence of the invention, or the polypeptide is encoded by a 
nucleic acid of the invention; (b) providing a composition comprising a vegetable oil; and, 
(c) contacting the polypeptide of step (a) and the vegetable oil of step (b) under conditions 
wherein the polypeptide can cleave ester linkages in the vegetable oil, thereby degumming 
the oil. In one aspect, the vegetable oil comprises oilseed. The vegetable oil can comprise 
palm oil, rapeseed oil, corn oil, soybean oil, canola oil, sesame oil, peanut oil or sunflower 
oil. In one aspect, the method further comprises addition of a phospholipase of the 
invention, another phospholipase or a combination thereof. 

The invention provides methods for converting a non-hydratable phospholipid 
to a hydratable form comprising the following steps: (a) providing a polypeptide having a 
phospholipase activity, wherein the polypeptide comprises an amino acid sequence of the 
invention, or the polypeptide is encoded by a nucleic acid of the invention; (b) providing a 
composition comprising a non-hydratable phospholipid; and, (c) contacting the polypeptide 
of step (a) and the non-hydratable phospholipid of step (b) under conditions wherein the 
polypeptide can cleave ester linkages in the non-hydratable phospholipid, thereby converting 
a non-hydratable phospholipid to a hydratable form. 

The invention provides methods for degumming an oil comprising the 
following steps: (a) providing a composition comprising a polypeptide of the invention 
having a phospholipase activity or a polypeptide encoded by a nucleic acid of the invention; 
(b) providing a composition comprising a fat or an oil comprising a phospholipid; and (c) 
contacting the polypeptide of step (a) and the composition of step (b) under conditions 
wherein the polypeptide can degum the phospholipid-comprising composition (under 
conditions wherein the polypeptide of the invention can catalyze the hydrolysis of a 
phospholipid). In one aspect the oil-comprising composition comprises a plant, an animal, an 
algae or a fish oil. The plant oil can comprise a soybean oil, a rapeseed oil, a corn oil, an oil 
from a palm kernel, a canola oil, a sunflower oil, a sesame oil or a peanut oil. The 
polypeptide can hydrolyze a phosphatide from a hydratable and/or a non-hydratable 
phospholipid in the oil-comprising composition. The polypeptide can hydrolyze a 
phosphatide at a glyceryl phosphoester bond to generate a diglyceride and water-soluble 
phosphate compound. The polypeptide can have a phospholipase C, B, A or D activity. In 
one aspect, a phospholipase D activity and a phosphatase enzyme are added. The contacting 
can comprise hydrolysis of a hydrated phospholipid in an oil. The hydrolysis conditions 
can comprise a temperature of about 20°C to 40°C at an alkaline pH. The alkaline conditions 
can comprise a pH of about pH 8 to pH 10. The hydrolysis conditions can comprise a 
reaction time of about 3 to 10 minutes. The hydrolysis conditions can comprise hydrolysis of 
hydratable and non-hydratable phospholipids in oil at a temperature of about 50°C to 60°C, at 
a pH of about pH 5 to pH 6.5 using a reaction time of about 30 to 60 minutes. The 
polypeptide can be bound to a filter and the phospholipid-containing fat or oil is passed 
through the filter. The polypeptide can be added to a solution comprising the 
phospholipid-containing fat or oil and then the solution is passed through a filter. 
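
The two sets of hydrolysis conditions recited above (an alkaline set and a mildly acidic set) can be written down as data and checked against a proposed process setting. The following Python sketch is illustrative only: the class name, the regime labels and the function `matching_regimes` are hypothetical, and the "about" ranges of the text are treated as hard bounds purely for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HydrolysisRegime:
    name: str
    temp_c: tuple      # (min, max) temperature in degrees Celsius
    ph: tuple          # (min, max) pH
    minutes: tuple     # (min, max) reaction time in minutes

    def admits(self, temp_c, ph, minutes):
        """True when all three values fall inside the regime's ranges."""
        return (self.temp_c[0] <= temp_c <= self.temp_c[1]
                and self.ph[0] <= ph <= self.ph[1]
                and self.minutes[0] <= minutes <= self.minutes[1])


# Ranges taken from the paragraph above; the regime names are hypothetical.
REGIMES = [
    HydrolysisRegime("alkaline, hydrated phospholipid",
                     (20, 40), (8, 10), (3, 10)),
    HydrolysisRegime("hydratable and non-hydratable phospholipids",
                     (50, 60), (5, 6.5), (30, 60)),
]


def matching_regimes(temp_c, ph, minutes):
    """Return the names of every regime that admits the given conditions."""
    return [r.name for r in REGIMES if r.admits(temp_c, ph, minutes)]


print(matching_regimes(temp_c=55, ph=6.0, minutes=45))
print(matching_regimes(temp_c=30, ph=9.0, minutes=5))
print(matching_regimes(temp_c=45, ph=7.0, minutes=20))
```

A setting that falls outside both ranges, such as the third call, simply returns an empty list; that says nothing about whether the enzyme would still be active there.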

The invention provides methods for converting a non-hydratable phospholipid 
to a hydratable form comprising the following steps: (a) providing a composition comprising 
a polypeptide having a phospholipase activity of the invention, or a polypeptide encoded by a 
nucleic acid of the invention; (b) providing a composition comprising a non-hydratable 
phospholipid; and (c) contacting the polypeptide of step (a) and the composition of step (b) 
under conditions wherein the polypeptide converts the non-hydratable phospholipid to a 
hydratable form. The polypeptide can have a phospholipase C activity. The polypeptide can 
have a phospholipase D activity and a phosphatase enzyme is also added. 

The invention provides methods for caustic refining of a 
phospholipid-containing composition comprising the following steps: (a) providing a composition 
comprising a polypeptide of the invention having a phospholipase activity, or a polypeptide 
encoded by a nucleic acid of the invention; (b) providing a composition comprising a 
phospholipid; and (c) contacting the polypeptide of step (a) with the composition of step (b) 
before, during or after the caustic refining. The polypeptide can have a phospholipase C 
activity. The polypeptide can be added before caustic refining and the composition 
comprising the phospholipid can comprise a plant and the polypeptide can be expressed 
transgenically in the plant, the polypeptide having a phospholipase activity can be added 
during crushing of a seed or other plant part, or, the polypeptide having a phospholipase 
activity is added following crushing or prior to refining. The polypeptide can be added 
during caustic refining and varying levels of acid and caustic can be added depending on 
levels of phosphorus and levels of free fatty acids. The polypeptide can be added after 
caustic refining: in an intense mixer or retention mixer prior to separation; following a 
heating step; in a centrifuge; in a soapstock; in a washwater; or, during bleaching or 
deodorizing steps. 

The invention provides methods for purification of a phytosterol or a 
triterpene comprising the following steps: (a) providing a composition comprising a 
polypeptide of the invention having a phospholipase activity, or a polypeptide encoded by a 
nucleic acid of the invention; (b) providing a composition comprising a phytosterol or a 
triterpene; and (c) contacting the polypeptide of step (a) with the composition of step (b) 
under conditions wherein the polypeptide can catalyze the hydrolysis of a phospholipid in the 
composition. The polypeptide can have a phospholipase C activity. The phytosterol or a 
triterpene can comprise a plant sterol. The plant sterol can be derived from a vegetable oil. 
The vegetable oil can comprise a coconut oil, canola oil, cocoa butter oil, corn oil, cottonseed 
oil, linseed oil, olive oil, palm oil, peanut oil, oil derived from a rice bran, safflower oil, 
sesame oil, soybean oil or a sunflower oil. The method can comprise use of nonpolar 
solvents to quantitatively extract free phytosterols and phytosteryl fatty-acid esters. The 
phytosterol or a triterpene can comprise a β-sitosterol, a campesterol, a stigmasterol, a 
stigmastanol, a β-sitostanol, a sitostanol, a desmosterol, a chalinasterol, a poriferasterol, a 
clionasterol or a brassicasterol. 

The invention provides methods for refining a crude oil comprising the 
following steps: (a) providing a composition comprising a polypeptide of the invention 
having a phospholipase activity, or a polypeptide encoded by a nucleic acid of the invention; 
(b) providing a composition comprising an oil comprising a phospholipid; and (c) contacting 
the polypeptide of step (a) with the composition of step (b) under conditions wherein the 
polypeptide can catalyze the hydrolysis of a phospholipid in the composition. The 
polypeptide can have a phospholipase C activity. The polypeptide having a phospholipase 
activity can be in a water solution that is added to the composition. The water level can be 
between about 0.5 to 5%. The process time can be less than about 2 hours, less than about 60 
minutes, less than about 30 minutes, less than 15 minutes, or less than 5 minutes. The 
hydrolysis conditions can comprise a temperature of between about 25°C-70°C. The 
hydrolysis conditions can comprise use of caustics. The hydrolysis conditions can comprise a 
pH of between about pH 3 and pH 10, between about pH 4 and pH 9, or between about pH 5 
and pH 8. The hydrolysis conditions can comprise addition of emulsifiers and/or mixing 
after the contacting of step (c). The methods can comprise addition of an emulsion-breaker 
and/or heat to promote separation of an aqueous phase. The methods can comprise 
degumming before the contacting step to collect lecithin by centrifugation and then adding a 
PLC and/or a PLA to remove non-hydratable phospholipids. The methods can 
comprise water degumming of crude oil to less than 10 ppm for edible oils and subsequent 
physical refining to less than about 50 ppm for biodiesel oils. The methods can comprise 
addition of acid to promote hydration of non-hydratable phospholipids. 




09010-094001 

WO 03/089620 PCT/US03/12556 

The details of one or more embodiments of the invention are set forth in the 
accompanying drawings and the description below. Other features, objects, and advantages 
of the invention will be apparent from the description and drawings, and from the claims. 

All publications, patents, patent applications, GenBank sequences and ATCC 
deposits, cited herein are hereby expressly incorporated by reference for all purposes. 

BRIEF DESCRIPTION OF THE DRAWINGS 
The following drawings are illustrative of embodiments of the invention and 
are not meant to limit the scope of the invention as encompassed by the claims. 

Figure 1 is a block diagram of a computer system, as described in detail, 
below. 

Figure 2 is a flow diagram illustrating one aspect of a process 200 for 
comparing a new nucleotide or protein sequence with a database of sequences in order to 
determine the homology levels between the new sequence and the sequences in the database, 
as described in detail, below. 

Figure 3 is a flow diagram illustrating one embodiment of a process in a 
computer for determining whether two sequences are homologous, as described in detail, 
below. 

Figure 4 is a flow diagram illustrating one aspect of an identifier process for 
detecting the presence of a feature in a sequence, as described in detail, below. 

Figures 5A, 5B and 5C schematically illustrate a model two-phase system for 
simulation of PLC-mediated degumming, as described in detail in Example 2, below. 

Figure 6 schematically illustrates an exemplary vegetable oil refining process 
using the phospholipases of the invention. 

Figure 7 schematically illustrates an exemplary degumming process of the 
invention for physically refined oils, as discussed in detail, below. 

Figure 8 schematically illustrates phosphatide hydrolysis with a phospholipase 
C of the invention, as discussed in detail, below. 

Figure 9 schematically illustrates application of a phospholipase C of the 
invention as a "Caustic Refining Aid" (Long Mix Caustic Refining), as discussed in detail, 
below. 

Figure 10 schematically illustrates application of a phospholipase C of the 
invention as a degumming aid, as discussed in detail, below. 

Like reference symbols in the various drawings indicate like elements. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides phospholipases (e.g., phospholipase A, B, C, 
D, patatin enzymes), polynucleotides encoding them and methods for making and using 
them. The invention provides enzymes that efficiently cleave glycerolphosphate ester linkage 
in oils, such as vegetable oils, e.g., oilseed phospholipids, to generate a water extractable 
phosphorylated base and a diglyceride. In one aspect, the phospholipases of the invention 
have a lipid acyl hydrolase (LAH) activity. In alternative aspects, the phospholipases of the 
invention can cleave glycerolphosphate ester linkages in phosphatidylcholine, 
phosphatidylethanolamine, phosphatidylserine and sphingomyelin. 

A phospholipase of the invention (e.g., phospholipase A, B, C, D, patatin 
enzymes) can be used for enzymatic degumming of vegetable oils because the phosphate 
moiety is soluble in water and easy to remove. The diglyceride product will remain in the oil 
and therefore will reduce losses. The PLCs of the invention can be used in addition to or in 
place of PLA1s and PLA2s in commercial oil degumming, such as in the ENZYMAX® 
process, where phospholipids are hydrolyzed by PLA1 and PLA2. 

In one aspect, the phospholipases of the invention are active at a high and/or at 
a low temperature, or, over a wide range of temperature, e.g., they can be active in the 
temperatures ranging between 20°C to 90°C, between 30°C to 80°C, or between 40°C to 
70°C. The invention also provides phospholipases that have activity at alkaline 
pHs or at acidic pHs, e.g., low water acidity. In alternative aspects, the phospholipases of 
the invention can have activity in acidic pHs as low as pH 6.5, pH 6.0, pH 5.5, pH 5.0, pH 
4.5, pH 4.0 and pH 3.5. In alternative aspects, the phospholipases of the invention can have 
activity in alkaline pHs as high as pH 7.5, pH 8.0, pH 8.5, pH 9.0, and pH 9.5. In one aspect, 
the phospholipases of the invention are active in the temperature range of between about 
40°C to about 70°C under conditions of low water activity (low water content). 

The invention also provides methods for further modifying the exemplary 
phospholipases of the invention to generate enzymes with desirable properties. For example, 
phospholipases generated by the methods of the invention can have altered substrate 
specificities, substrate binding specificities, substrate cleavage patterns, thermal stability, 
pH/activity profile, pH/stability profile (such as increased stability at low, e.g. pH<6 or pH<5, 
or high, e.g. pH>9, pH values), stability towards oxidation, Ca2+ dependency, specific activity 
and the like. The invention provides for altering any property of interest. For instance, the 
alteration may result in a variant which, as compared to a parent phospholipase, has altered 
pH and temperature activity profile. 

In one aspect, the phospholipases of the invention are used in various 
vegetable oil processing steps, such as in vegetable oil extraction, particularly, in the removal 
of "phospholipid gums" in a process called "oil degumming," as described herein, in the 
production of vegetable oils from various sources, such as soybeans, rapeseed, peanut, 
sesame, sunflower and corn. The phospholipase enzymes of the invention can be used in 
place of PLA, e.g., phospholipase A2, in any vegetable oil processing step. 

Definitions 

The term "phospholipase" encompasses enzymes having any phospholipase 
activity, for example, cleaving a glycerolphosphate ester linkage (catalyzing hydrolysis of a 
glycerolphosphate ester linkage), e.g., in an oil, such as a vegetable oil. The phospholipase 
activity of the invention can generate a water extractable phosphorylated base and a 
diglyceride. The phospholipase activity of the invention also includes hydrolysis of 
glycerolphosphate ester linkages at high temperatures, low temperatures, alkaline pHs and at 
acidic pHs. The term "a phospholipase activity" also includes cleaving a glycerolphosphate 
ester to generate a water extractable phosphorylated base and a diglyceride. The term "a 
phospholipase activity" also includes cutting ester bonds of glycerin and phosphoric acid in 
phospholipids. The term "a phospholipase activity" also includes other activities, such as the 
ability to bind to a substrate, such as an oil, e.g. a vegetable oil, substrate also including plant 
and animal phosphatidylcholines, phosphatidyl-ethanolamines, phosphatidylserines and 
sphingomyelins. The phospholipase activity can comprise a phospholipase C (PLC) activity, 
a phospholipase A (PLA) activity, such as a phospholipase A1 or phospholipase A2 activity, 
a phospholipase B (PLB) activity, such as a phospholipase B1 or phospholipase B2 activity, a 
phospholipase D (PLD) activity, such as a phospholipase D1 or a phospholipase D2 activity. 
The phospholipase activity can comprise hydrolysis of a glycoprotein, e.g., as a glycoprotein 
found in a potato tuber or any plant of the genus Solanum, e.g., Solanum tuberosum. The 
phospholipase activity can comprise a patatin enzymatic activity, such as a patatin esterase 
activity (see, e.g., Jimenez (2002) Biotechnol. Prog. 18:635-640). The phospholipase activity 
can comprise a lipid acyl hydrolase (LAH) activity. 

The term "antibody" includes a peptide or polypeptide derived from, modeled 
after or substantially encoded by an immunoglobulin gene or immunoglobulin genes, or 
fragments thereof, capable of specifically binding an antigen or epitope, see, e.g., 
Fundamental Immunology, Third Edition, W.E. Paul, ed., Raven Press, N.Y. (1993); Wilson 
(1994) J. Immunol. Methods 175:267-273; Yarmush (1992) J. Biochem. Biophys. Methods 
25:85-97. The term antibody includes antigen-binding portions, i.e., "antigen binding sites," 
(e.g., fragments, subsequences, complementarity determining regions (CDRs)) that retain 
capacity to bind antigen, including (i) a Fab fragment, a monovalent fragment consisting of 
the VL, VH, CL and CH1 domains; (ii) a F(ab')2 fragment, a bivalent fragment comprising 
two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment 
consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH 
domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 
341:544-546), which consists of a VH domain; and (vi) an isolated complementarity 
determining region (CDR). Single chain antibodies are also included by reference in the term 
"antibody." 

The terms "array" or "microarray" or "biochip" or "chip" as used herein refer to a 
plurality of target elements, each target element comprising a defined amount of one or more 
polypeptides (including antibodies) or nucleic acids immobilized onto a defined area of a 
substrate surface, as discussed in further detail, below. 

As used herein, the terms "computer," "computer program" and "processor" 
are used in their broadest general contexts and incorporate all such devices, as described in 
detail, below. 

A "coding sequence of" or a "sequence encodes" a particular polypeptide or 
protein, is a nucleic acid sequence which is transcribed and translated into a polypeptide or 
protein when placed under the control of appropriate regulatory sequences. 

The term "expression cassette" as used herein refers to a nucleotide sequence 
which is capable of effecting expression of a structural gene (i.e., a protein coding sequence, 
such as a phospholipase of the invention) in a host compatible with such sequences. 
Expression cassettes include at least a promoter operably linked with the polypeptide coding 
sequence; and, optionally, with other sequences, e.g., transcription termination signals. 
Additional factors necessary or helpful in effecting expression may also be used, e.g., 
enhancers. "Operably linked" as used herein refers to linkage of a promoter upstream from a 
DNA sequence such that the promoter mediates transcription of the DNA sequence. Thus, 
expression cassettes also include plasmids, expression vectors, recombinant viruses, any form 
of recombinant "naked DNA" vector, and the like. A "vector" comprises a nucleic acid 
which can infect, transfect, transiently or permanently transduce a cell. It will be recognized 
that a vector can be a naked nucleic acid, or a nucleic acid complexed with protein or lipid. 
The vector optionally comprises viral or bacterial nucleic acids and/or proteins, and/or 
membranes (e.g., a cell membrane, a viral lipid envelope, etc.). Vectors include, but are not 
limited to replicons (e.g., RNA replicons, bacteriophages) to which fragments of DNA may 
be attached and become replicated. Vectors thus include, but are not limited to RNA, 
autonomous self-replicating circular or linear DNA or RNA (e.g., plasmids, viruses, and the 
like, see, e.g., U.S. Patent No. 5,217,879), and include both the expression and 
non-expression plasmids. Where a recombinant microorganism or cell culture is described as 
hosting an "expression vector" this includes both extra-chromosomal circular and linear DNA 
and DNA that has been incorporated into the host chromosome(s). Where a vector is being 
maintained by a host cell, the vector may either be stably replicated by the cells during 
mitosis as an autonomous structure, or be incorporated within the host's genome. 

"Plasmids" are designated by a lower case "p" preceded and/or followed by 
capital letters and/or numbers. The starting plasmids herein are either commercially 
available, publicly available on an unrestricted basis, or can be constructed from available 
plasmids in accord with published procedures. In addition, equivalent plasmids to those 
described herein are known in the art and will be apparent to the ordinarily skilled artisan. 

The term "gene" means the segment of DNA involved in producing a 
polypeptide chain, including, inter alia, regions preceding and following the coding region, 
such as leader and trailer, promoters and enhancers, as well as, where applicable, intervening 
sequences (introns) between individual coding segments (exons). 

The phrases "nucleic acid" or "nucleic acid sequence" as used herein refer to 
an oligonucleotide, nucleotide, polynucleotide, or to a fragment of any of these, to DNA or 
RNA (e.g., mRNA, rRNA, tRNA, iRNA) of genomic or synthetic origin which may be 
single-stranded or double-stranded and may represent a sense or antisense strand, to peptide 
nucleic acid (PNA), or to any DNA-like or RNA-like material, natural or synthetic in origin, 
including, e.g., iRNA, ribonucleoproteins (e.g., double stranded iRNAs, e.g., iRNPs). The 
term encompasses nucleic acids, i.e., oligonucleotides, containing known analogues of 
natural nucleotides. The term also encompasses nucleic-acid-like structures with synthetic 
backbones, see e.g., Mata (1997) Toxicol. Appl. Pharmacol. 144:189-197; Strauss-Soukup 
(1997) Biochemistry 36:8692-8698; Samstag (1996) Antisense Nucleic Acid Drug Dev 
6:153-156. 

"Amino acid" or "amino acid sequence" as used herein refer to an 
oligopeptide, peptide, polypeptide, or protein sequence, or to a fragment, portion, or subunit 
of any of these, and to naturally occurring or synthetic molecules. 

The terms "polypeptide" and "protein" as used herein, refer to amino acids 
joined to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and 
may contain modified amino acids other than the 20 gene-encoded amino acids. The term 
"polypeptide" also includes peptides and polypeptide fragments, motifs and the like. The 
term also includes glycosylated polypeptides. The peptides and polypeptides of the invention 
also include all "mimetic" and "peptidomimetic" forms, as described in further detail, below. 

As used herein, the term "isolated" means that the material is removed from its 
original environment (e.g., the natural environment if it is naturally occurring). For example, 
a naturally occurring polynucleotide or polypeptide present in a living animal is not isolated, 
but the same polynucleotide or polypeptide, separated from some or all of the coexisting 
materials in the natural system, is isolated. Such polynucleotides could be part of a vector 
and/or such polynucleotides or polypeptides could be part of a composition, and still be 
isolated in that such vector or composition is not part of its natural environment. As used 
herein, an isolated material or composition can also be a "purified" composition, i.e., it does 
not require absolute purity; rather, it is intended as a relative definition. Individual nucleic 
acids obtained from a library can be conventionally purified to electrophoretic homogeneity. 
In alternative aspects, the invention provides nucleic acids which have been purified from 
genomic DNA or from other sequences in a library or other environment by at least one, two, 
three, four, five or more orders of magnitude. 

As used herein, the term "recombinant" means that the nucleic acid is adjacent 
to a "backbone" nucleic acid to which it is not adjacent in its natural environment. In one 
aspect, nucleic acids represent 5% or more of the number of nucleic acid inserts in a 
population of nucleic acid "backbone molecules." "Backbone molecules" according to the 
invention include nucleic acids such as expression vectors, self-replicating nucleic acids, 
viruses, integrating nucleic acids, and other vectors or nucleic acids used to maintain or 
manipulate a nucleic acid insert of interest. In one aspect, the enriched nucleic acids 
represent 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more of the number of 
nucleic acid inserts in the population of recombinant backbone molecules. "Recombinant" 
polypeptides or proteins refer to polypeptides or proteins produced by recombinant DNA 
techniques; e.g., produced from cells transformed by an exogenous DNA construct encoding 
the desired polypeptide or protein. "Synthetic" polypeptides or protein are those prepared by 
chemical synthesis, as described in further detail, below. 

A promoter sequence is "operably linked to" a coding sequence when RNA 
polymerase which initiates transcription at the promoter will transcribe the coding sequence 
into mRNA, as discussed further, below. 

"Oligonucleotide" refers to either a single stranded polydeoxynucleotide or 
two complementary polydeoxynucleotide strands which may be chemically synthesized. 
Such synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another 
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A 
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated. 

The phrase "substantially identical" in the context of two nucleic acids or 
polypeptides, refers to two or more sequences that have at least 50%, 60%, 70%, 75%, 80%, 
85%, 90%, 95%, 98% or 99% nucleotide or amino acid residue (sequence) identity, when 
compared and aligned for maximum correspondence, as measured using any one known 
sequence comparison algorithm, as discussed in detail below, or by visual inspection. In 
alternative aspects, the invention provides nucleic acid and polypeptide sequences having 
substantial identity to an exemplary sequence of the invention, e.g., SEQ ID NO:1, SEQ ID 
NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID 
NO:8, etc., over a region of at least about 100 residues, 150 residues, 200 residues, 300 
residues, 400 residues, or a region ranging from between about 50 residues to the full length 
of the nucleic acid or polypeptide. Nucleic acid sequences of the invention can be 
substantially identical over the entire length of a polypeptide coding region. 
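
As a rough illustration of how a percent identity figure might be computed once two sequences have been aligned, consider the Python sketch below. It is not one of the sequence comparison algorithms referred to in the text: it assumes the alignment has already been produced by some other tool, and the function names, the gap convention and the example sequences are all hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue.

    Both inputs must already be aligned to equal length, with '-' marking
    gaps. Columns where both sequences have a gap are ignored; any other
    column containing a gap counts as a mismatch (one common convention,
    assumed here for illustration).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_substantially_identical(aligned_a, aligned_b, threshold=50.0):
    """Compare the identity against one of the recited thresholds
    (50%, 60%, ... 99%); the default of 50% is the lowest of them."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue aligned region differing at one position.
print(percent_identity("ATGGCTAAGC", "ATGGCTTAGC"))
print(is_substantially_identical("ATGGCTAAGC", "ATGGCTTAGC", threshold=95.0))
```

Different tools count gaps and alignment length differently, so identity values from this sketch need not match those reported by a particular alignment program.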

Additionally a "substantially identical" amino acid sequence is a sequence that 
differs from a reference sequence by one or more conservative or non-conservative amino 
acid substitutions, deletions, or insertions, particularly when such a substitution occurs at a 
site that is not the active site of the molecule, and provided that the polypeptide essentially 
retains its functional properties. A conservative amino acid substitution, for example, 
substitutes one amino acid for another of the same class (e.g., substitution of one hydrophobic 
amino acid, such as isoleucine, valine, leucine, or methionine, for another, or substitution of 
one polar amino acid for another, such as substitution of arginine for lysine, glutamic acid for 
aspartic acid or glutamine for asparagine). One or more amino acids can be deleted, for 
example, from a phospholipase polypeptide, resulting in modification of the structure of the 
polypeptide, without significantly altering its biological activity. For example, amino- or 
carboxyl-terminal amino acids that are not required for phospholipase biological activity can 
be removed. Modified polypeptide sequences of the invention can be assayed for 
phospholipase biological activity by any number of methods, including contacting the 
modified polypeptide sequence with a phospholipase substrate and determining whether the 
modified polypeptide decreases the amount of specific substrate in the assay or increases the 
bioproducts of the enzymatic reaction of a functional phospholipase with the substrate, as 
discussed further, below. 
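
The notion of a conservative substitution can be made concrete with a small Python sketch. The groupings below are limited to the examples named in the preceding paragraph (isoleucine, valine, leucine and methionine; arginine and lysine; glutamic acid and aspartic acid; glutamine and asparagine); fuller classification schemes exist and are not reproduced here. The function names and the example peptides are hypothetical.

```python
# Groups limited to the examples named in the text; one-letter codes.
EXAMPLE_CLASSES = [
    frozenset("IVLM"),   # hydrophobic: isoleucine, valine, leucine, methionine
    frozenset("RK"),     # arginine / lysine
    frozenset("ED"),     # glutamic acid / aspartic acid
    frozenset("QN"),     # glutamine / asparagine
]


def is_conservative(original, replacement):
    """True when both residues differ and share one of the example groups."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group
               for group in EXAMPLE_CLASSES)


def list_substitutions(reference, variant):
    """Return (position, reference residue, variant residue, conservative?)
    for every differing position of two equal-length sequences.
    Positions are 1-based."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [(i + 1, r, v, is_conservative(r, v))
            for i, (r, v) in enumerate(zip(reference, variant))
            if r != v]


# Hypothetical five-residue reference and variant peptides.
for entry in list_substitutions("MKLVE", "MRLAD"):
    print(entry)
```

Whether a given substitution actually preserves activity still has to be established by assay, as the paragraph above notes.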

"Hybridization" refers to the process by which a nucleic acid strand joins with 
a complementary strand through base pairing. Hybridization reactions can be sensitive and 
selective so that a particular sequence of interest can be identified even in samples in which it 
is present at low concentrations. Suitably stringent conditions can be defined by, for 
example, the concentrations of salt or formamide in the prehybridization and hybridization 
solutions, or by the hybridization temperature, and are well known in the art. For example, 
stringency can be increased by reducing the concentration of salt, increasing the 
concentration of formamide, or raising the hybridization temperature, altering the time of 
hybridization, as described in detail, below. In alternative aspects, nucleic acids of the 
invention are defined by their ability to hybridize under various stringency conditions (e.g., 
high, medium, and low), as set forth herein. 

The term "variant" refers to polynucleotides or polypeptides of the invention 
modified at one or more base pairs, codons, introns, exons, or amino acid residues 
(respectively) yet still retain the biological activity of a phospholipase of the invention. 
Variants can be produced by any number of means, including methods such as, for example, 
error-prone PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual 
PCR mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble 
mutagenesis, exponential ensemble mutagenesis, site-specific mutagenesis, gene reassembly, 
GSSM and any combination thereof. Techniques for producing variant phospholipases 
having activity at a pH or temperature, for example, that is different from a wild-type 
phospholipase, are included herein. 

The term "saturation mutagenesis" or "GSSM" includes a method that uses 
degenerate oligonucleotide primers to introduce point mutations into a polynucleotide, as 
described in detail, below. 
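
To show what "degenerate" means for an oligonucleotide primer, the Python sketch below expands a degenerate codon into the concrete codons it stands for, using the standard IUPAC nucleotide ambiguity codes. The choice of the codon `NNK` (N = any base, K = G or T) is an assumption made for illustration; this passage does not state which degenerate codons are used. The function name is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """Return every concrete sequence represented by a degenerate oligo."""
    pools = [IUPAC[base] for base in oligo.upper()]
    return ["".join(bases) for bases in product(*pools)]


# Assumed example codon: 4 x 4 x 2 = 32 concrete codons.
codons = expand_degenerate("NNK")
print(len(codons))
print("TAG" in codons, "TAA" in codons, "TGA" in codons)
```

Because the third position is restricted to G or T, the stop codons TAA and TGA are excluded from the expansion while TAG remains.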

The term "optimized directed evolution system" or "optimized directed 
evolution" includes a method for reassembling fragments of related nucleic acid sequences, 
e.g., related genes, and explained in detail, below. 

The term "synthetic ligation reassembly" or "SLR" includes a method of 
ligating oligonucleotide fragments in a non-stochastic fashion, and explained in detail, below. 

Generating and Manipulating Nucleic Acids 

The invention provides nucleic acids (e.g., the exemplary SEQ ID NO:1, SEQ 
ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, 
SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID 
NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, 
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID 
NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, 
SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID 
NO:69, SEQ ID NO:71, SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, 
SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID 
NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, 
SEQ ID NO:103, SEQ ID NO:105), including expression cassettes such as expression 
vectors, encoding the polypeptides and phospholipases of the invention. The invention also 
includes methods for discovering new phospholipase sequences using the nucleic acids of the 
invention. Also provided are methods for modifying the nucleic acids of the invention by, 
e.g., synthetic ligation reassembly, optimized directed evolution system and/or saturation 
mutagenesis. 

The nucleic acids of the invention can be made, isolated and/or manipulated
by, e.g., cloning and expression of cDNA libraries, amplification of message or genomic
DNA by PCR, and the like. In practicing the methods of the invention, homologous genes
can be modified by manipulating a template nucleic acid, as described herein. The invention
can be practiced in conjunction with any method or protocol or device known in the art,
which are well described in the scientific and patent literature.

General Techniques 

The nucleic acids used to practice this invention, whether RNA, iRNA,
antisense nucleic acid, cDNA, genomic DNA, vectors, viruses or hybrids thereof, may be
isolated from a variety of sources, genetically engineered, amplified, and/or expressed/
generated recombinantly. Recombinant polypeptides generated from these nucleic acids can
be individually isolated or cloned and tested for a desired activity. Any recombinant
expression system can be used, including bacterial, mammalian, yeast, insect or plant cell
expression systems.

Alternatively, these nucleic acids can be synthesized in vitro by well-known
chemical synthesis techniques, as described in, e.g., Adams (1983) J. Am. Chem. Soc.
105:661; Belousov (1997) Nucleic Acids Res. 25:3440-3444; Frenkel (1995) Free Radic.
Biol. Med. 19:373-380; Blommers (1994) Biochemistry 33:7886-7896; Narang (1979) Meth.
Enzymol. 68:90; Brown (1979) Meth. Enzymol. 68:109; Beaucage (1981) Tetra. Lett.
22:1859; U.S. Patent No. 4,458,066.

Techniques for the manipulation of nucleic acids, such as, e.g., subcloning,
labeling probes (e.g., random-primer labeling using Klenow polymerase, nick translation,
amplification), sequencing, hybridization and the like are well described in the scientific and
patent literature, see, e.g., Sambrook, ed., MOLECULAR CLONING: A LABORATORY MANUAL
(2ND ED.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989); CURRENT PROTOCOLS IN
MOLECULAR BIOLOGY, Ausubel, ed. John Wiley & Sons, Inc., New York (1997);
LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY: HYBRIDIZATION
WITH NUCLEIC ACID PROBES, Part I. Theory and Nucleic Acid Preparation, Tijssen, ed.
Elsevier, N.Y. (1993).

Another useful means of obtaining and manipulating nucleic acids used to
practice the methods of the invention is to clone from genomic samples, and, if desired,
screen and re-clone inserts isolated or amplified from, e.g., genomic clones or cDNA clones.
Sources of nucleic acid used in the methods of the invention include genomic or cDNA
libraries contained in, e.g., mammalian artificial chromosomes (MACs), see, e.g., U.S. Patent
Nos. 5,721,118; 6,025,155; human artificial chromosomes, see, e.g., Rosenfeld (1997) Nat.
Genet. 15:333-335; yeast artificial chromosomes (YAC); bacterial artificial chromosomes
(BAC); P1 artificial chromosomes, see, e.g., Woon (1998) Genomics 50:306-316; P1-derived
vectors (PACs), see, e.g., Kern (1997) Biotechniques 23:120-124; cosmids, recombinant
viruses, phages or plasmids.

In one aspect, a nucleic acid encoding a polypeptide of the invention is
assembled in appropriate phase with a leader sequence capable of directing secretion of the
translated polypeptide or fragment thereof.

The invention provides fusion proteins and nucleic acids encoding them. A
polypeptide of the invention can be fused to a heterologous peptide or polypeptide, such as
N-terminal identification peptides which impart desired characteristics, such as increased
stability or simplified purification. Peptides and polypeptides of the invention can also be
synthesized and expressed as fusion proteins with one or more additional domains linked
thereto for, e.g., producing a more immunogenic peptide, to more readily isolate a
recombinantly synthesized peptide, to identify and isolate antibodies and antibody-expressing
B cells, and the like. Detection and purification facilitating domains include, e.g., metal
chelating peptides such as polyhistidine tracts and histidine-tryptophan modules that allow
purification on immobilized metals, protein A domains that allow purification on
immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity
purification system (Immunex Corp, Seattle WA). The inclusion of cleavable linker
sequences such as Factor Xa or enterokinase (Invitrogen, San Diego CA) between a
purification domain and the motif-comprising peptide or polypeptide can facilitate purification.
For example, an expression vector can include an epitope-encoding nucleic acid sequence
linked to six histidine residues followed by a thioredoxin and an enterokinase cleavage site
(see e.g., Williams (1995) Biochemistry 34:1787-1797; Dobeli (1998) Protein Expr. Purif.
12:404-414). The histidine residues facilitate detection and purification while the
enterokinase cleavage site provides a means for purifying the epitope from the remainder of
the fusion protein. Technology pertaining to vectors encoding fusion proteins and application
of fusion proteins are well described in the scientific and patent literature, see e.g., Kroll
(1993) DNA Cell. Biol., 12:441-53.

Transcriptional and translational control sequences

The invention provides nucleic acid (e.g., DNA) sequences of the invention
operatively linked to expression (e.g., transcriptional or translational) control sequence(s),
e.g., promoters or enhancers, to direct or modulate RNA synthesis/expression. The
expression control sequence can be in an expression vector. Exemplary bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Exemplary eukaryotic promoters
include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from
retrovirus, and mouse metallothionein I.

Promoters suitable for expressing a polypeptide in bacteria include the E. coli 
lac or trp promoters, the lacI promoter, the lacZ promoter, the T3 promoter, the T7 promoter, 
the gpt promoter, the lambda PR promoter, the lambda PL promoter, promoters from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), and the acid 
phosphatase promoter. Eukaryotic promoters include the CMV immediate early promoter, 
the HSV thymidine kinase promoter, heat shock promoters, the early and late SV40 promoter, 
LTRs from retroviruses, and the mouse metallothionein-I promoter. Other promoters known 
to control expression of genes in prokaryotic or eukaryotic cells or their viruses may also be 
used. 

Expression vectors and cloning vehicles

The invention provides expression vectors and cloning vehicles comprising
nucleic acids of the invention, e.g., sequences encoding the phospholipases of the invention.
Expression vectors and cloning vehicles of the invention can comprise viral particles,
baculovirus, phage, plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes,
viral DNA (e.g., vaccinia, adenovirus, fowl pox virus, pseudorabies and derivatives of SV40),
P1-based artificial chromosomes, yeast plasmids, yeast artificial chromosomes, and any other
vectors specific for specific hosts of interest (such as Bacillus, Aspergillus and yeast).
Vectors of the invention can include chromosomal, non-chromosomal and synthetic DNA
sequences. Large numbers of suitable vectors are known to those of skill in the art, and are
commercially available. Exemplary vectors include: bacterial: pQE vectors (Qiagen),
pBluescript plasmids, pNH vectors, lambda-ZAP vectors (Stratagene); ptrc99a, pKK223-3,
pDR540, pRIT2T (Pharmacia); Eukaryotic: pXT1, pSG5 (Stratagene), pSVK3, pBPV,
pMSG, pSVLSV40 (Pharmacia). However, any other plasmid or other vector may be used so
long as they are replicable and viable in the host. Low copy number or high copy number
vectors may be employed with the present invention.

The expression vector may comprise a promoter, a ribosome-binding site for 
translation initiation and a transcription terminator. The vector may also include appropriate 
sequences for amplifying expression. Mammalian expression vectors can comprise an origin 
of replication, any necessary ribosome binding sites, a polyadenylation site, splice donor and
acceptor sites, transcriptional termination sequences, and 5' flanking non-transcribed
sequences. In some aspects, DNA sequences derived from the SV40 splice and
polyadenylation sites may be used to provide the required non-transcribed genetic elements. 

In one aspect, the expression vectors contain one or more selectable marker 
genes to permit selection of host cells containing the vector. Such selectable markers include 
genes encoding dihydrofolate reductase or genes conferring neomycin resistance for 
eukaryotic cell culture, genes conferring tetracycline or ampicillin resistance in E. coli, and
the S. cerevisiae TRP1 gene. Promoter regions can be selected from any desired gene using
chloramphenicol transferase (CAT) vectors or other vectors with selectable markers.

Vectors for expressing the polypeptide or fragment thereof in eukaryotic cells 
may also contain enhancers to increase expression levels. Enhancers are cis-acting elements 
of DNA, usually from about 10 to about 300 bp in length, that act on a promoter to increase its
transcription. Examples include the SV40 enhancer on the late side of the replication origin
bp 100 to 270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the
late side of the replication origin, and the adenovirus enhancers.

A DNA sequence may be inserted into a vector by a variety of procedures. In
general, the DNA sequence is ligated to the desired position in the vector following digestion
of the insert and the vector with appropriate restriction endonucleases. Alternatively, blunt
ends in both the insert and the vector may be ligated. A variety of cloning techniques are
known in the art, e.g., as described in Ausubel and Sambrook. Such procedures and others
are deemed to be within the scope of those skilled in the art.

The vector may be in the form of a plasmid, a viral particle, or a phage. Other 
vectors include chromosomal, non-chromosomal and synthetic DNA sequences, derivatives
of SV40; bacterial plasmids, phage DNA, baculovirus, yeast plasmids, vectors derived from
combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox
virus, and pseudorabies. A variety of cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described by, e.g., Sambrook.

Particular bacterial vectors which may be used include the commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017), pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden), GEM1 (Promega
Biotec, Madison, WI, USA) pQE70, pQE60, pQE-9 (Qiagen), pD10, psiX174 pBluescript II
KS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene), ptrc99a, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia), pKK232-8 and pCM7. Particular eukaryotic vectors include
pSV2CAT, pOG44, pXT1, pSG (Stratagene) pSVK3, pBPV, pMSG, and pSVL (Pharmacia).
However, any other vector may be used as long as it is replicable and viable in the host cell.

Host cells and transformed cells

The invention also provides a transformed cell comprising a nucleic acid
sequence of the invention, e.g., a sequence encoding a phospholipase of the invention, or a
vector of the invention. The host cell may be any of the host cells familiar to those skilled in
the art, including prokaryotic cells, eukaryotic cells, such as bacterial cells, fungal cells, yeast
cells, mammalian cells, insect cells, or plant cells. Exemplary bacterial cells include E. coli,
Streptomyces, Bacillus subtilis, Salmonella typhimurium and various species within the
genera Pseudomonas, Streptomyces, and Staphylococcus. Exemplary insect cells include
Drosophila S2 and Spodoptera Sf9. Exemplary animal cells include CHO, COS or Bowes
melanoma or any mouse or human cell line. The selection of an appropriate host is within the
abilities of those skilled in the art.

The vector may be introduced into the host cells using any of a variety of
techniques, including transformation, transfection, transduction, viral infection, gene guns, or
Ti-mediated gene transfer. Particular methods include calcium phosphate transfection,
DEAE-Dextran mediated transfection, lipofection, or electroporation (Davis, L., Dibner, M.,
Battey, I., Basic Methods in Molecular Biology, (1986)).

Where appropriate, the engineered host cells can be cultured in conventional 
nutrient media modified as appropriate for activating promoters, selecting transformants or 
amplifying the genes of the invention. Following transformation of a suitable host strain and 
growth of the host strain to an appropriate cell density, the selected promoter may be induced 
by appropriate means (e.g., temperature shift or chemical induction) and the cells may be
cultured for an additional period to allow them to produce the desired polypeptide or 
fragment thereof. 

Cells can be harvested by centrifugation, disrupted by physical or chemical
means, and the resulting crude extract is retained for further purification. Microbial cells
employed for expression of proteins can be disrupted by any convenient method, including
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Such
methods are well known to those skilled in the art. The expressed polypeptide or fragment
thereof can be recovered and purified from recombinant cell cultures by methods including
ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange
chromatography, phosphocellulose chromatography, hydrophobic interaction
chromatography, affinity chromatography, hydroxylapatite chromatography and lectin
chromatography. Protein refolding steps can be used, as necessary, in completing
configuration of the polypeptide. If desired, high performance liquid chromatography
(HPLC) can be employed for final purification steps.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the COS-7 lines
of monkey kidney fibroblasts and other cell lines capable of expressing proteins from a
compatible vector, such as the C127, 3T3, CHO, HeLa and BHK cell lines.

The constructs in host cells can be used in a conventional manner to produce
the gene product encoded by the recombinant sequence. Depending upon the host employed
in a recombinant production procedure, the polypeptides produced by host cells containing
the vector may be glycosylated or may be non-glycosylated. Polypeptides of the invention
may or may not also include an initial methionine amino acid residue. 

Cell-free translation systems can also be employed to produce a polypeptide of 
the invention. Cell-free translation systems can use mRNAs transcribed from a DNA
construct comprising a promoter operably linked to a nucleic acid encoding the polypeptide
or fragment thereof. In some aspects, the DNA construct may be linearized prior to
conducting an in vitro transcription reaction. The transcribed mRNA is then incubated with
an appropriate cell-free translation extract, such as a rabbit reticulocyte extract, to produce
the desired polypeptide or fragment thereof.

The expression vectors can contain one or more selectable marker genes to 
provide a phenotypic trait for selection of transformed host cells, such as dihydrofolate
reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or 
ampicillin resistance in E. coli. 

Amplification of Nucleic Acids

In practicing the invention, nucleic acids encoding the polypeptides of the
invention, or modified nucleic acids, can be reproduced by, e.g., amplification. The invention
provides amplification primer sequence pairs for amplifying nucleic acids encoding
polypeptides with a phospholipase activity. In one aspect, the primer pairs are capable of
amplifying nucleic acid sequences of the invention, e.g., including the exemplary SEQ ID
NO:1, or a subsequence thereof; a sequence as set forth in SEQ ID NO:3, or a subsequence
thereof; a sequence as set forth in SEQ ID NO:5, or a subsequence thereof; and, a sequence
as set forth in SEQ ID NO:7, or a subsequence thereof, etc. One of skill in the art can design
amplification primer sequence pairs for any part of or the full length of these sequences.

The invention provides an amplification primer sequence pair for amplifying a
nucleic acid encoding a polypeptide having a phospholipase activity, wherein the primer pair
is capable of amplifying a nucleic acid comprising a sequence of the invention, or fragments
or subsequences thereof. One or each member of the amplification primer sequence pair can
comprise an oligonucleotide comprising at least about 10 to 50 consecutive bases of the
sequence, or about 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 consecutive bases
of the sequence.

The invention provides amplification primer pairs, wherein the primer pair
comprises a first member having a sequence as set forth by about the first (the 5') 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 residues of a nucleic acid of the invention, and a
second member having a sequence as set forth by about the first (the 5') 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, or 25 residues of the complementary strand of the first member.
The invention provides phospholipases generated by amplification, e.g., polymerase chain
reaction (PCR), using an amplification primer pair of the invention. The invention provides
methods of making a phospholipase by amplification, e.g., polymerase chain reaction (PCR),
using an amplification primer pair of the invention. In one aspect, the amplification primer
pair amplifies a nucleic acid from a library, e.g., a gene library, such as an environmental
library.
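
The primer-pair definition above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a primer-design tool: the function names `reverse_complement` and `primer_pair`, the default length of 20, and the example template are hypothetical, and practical considerations such as melting temperature or secondary structure are ignored. The sketch reads "the complementary strand" as the reverse complement of the template written 5' to 3'.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(template, length=20):
    """Take the first (5') `length` residues of each strand of the template.

    `length` is limited to the 12 to 25 residue range recited in the text;
    the default of 20 is an arbitrary choice for this sketch.
    """
    if not 12 <= length <= 25:
        raise ValueError("length must be between 12 and 25 residues")
    if len(template) < length:
        raise ValueError("template is shorter than the requested primer")
    forward = template[:length]                       # 5' end of the given strand
    reverse = reverse_complement(template)[:length]   # 5' end of the complementary strand
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical 37-base template, not a sequence of the invention.
    template = "ATGGCTAAACCCGGGTTTACGTAGCTAGCTAGGATCC"
    print(primer_pair(template, length=12))
```

The reverse primer returned here anneals to the 3' end of the given strand, so the two primers together bracket the full length of the template.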

Amplification reactions can also be used to quantify the amount of nucleic
acid in a sample (such as the amount of message in a cell sample), label the nucleic acid (e.g.,
to apply it to an array or a blot), detect the nucleic acid, or quantify the amount of a specific
nucleic acid in a sample. In one aspect of the invention, message isolated from a cell or a
cDNA library is amplified. The skilled artisan can select and design suitable
oligonucleotide amplification primers. Amplification methods are also well known in the art,
and include, e.g., polymerase chain reaction, PCR (see, e.g., PCR PROTOCOLS, A GUIDE
TO METHODS AND APPLICATIONS, ed. Innis, Academic Press, N.Y. (1990) and PCR
STRATEGIES (1995), ed. Innis, Academic Press, Inc., N.Y.), ligase chain reaction (LCR)
(see, e.g., Wu (1989) Genomics 4:560; Landegren (1988) Science 241:1077; Barringer
(1990) Gene 89:117); transcription amplification (see, e.g., Kwoh (1989) Proc. Natl. Acad.
Sci. USA 86:1173); and, self-sustained sequence replication (see, e.g., Guatelli (1990) Proc.
Natl. Acad. Sci. USA 87:1874); Q Beta replicase amplification (see, e.g., Smith (1997) J.
Clin. Microbiol. 35:1477-1491), automated Q-beta replicase amplification assay (see, e.g.,
Burg (1996) Mol. Cell. Probes 10:257-271) and other RNA polymerase mediated techniques
(e.g., NASBA, Cangene, Mississauga, Ontario); see also Berger (1987) Methods Enzymol.
152:307-316; Sambrook; Ausubel; U.S. Patent Nos. 4,683,195 and 4,683,202; Sooknanan
(1995) Biotechnology 13:563-564.

Determining the degree of sequence identity

The invention provides nucleic acids comprising sequences having at least
about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%,
65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%,
81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, or more, or complete (100%) sequence identity to an exemplary nucleic acid
of the invention (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID
NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63,
SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:73, SEQ ID
NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85,
SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID
NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, and nucleic
acids encoding SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID
NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32,
SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID
NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54,
SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID
NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:74, SEQ ID NO:76,
SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:86, SEQ ID
NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID NO:94, SEQ ID NO:96, SEQ ID NO:98,
SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106) over a region of at
least about 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800,
850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550 or
more residues. The invention provides polypeptides comprising sequences having at least
about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%,
65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%,
81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, or more, or complete (100%) sequence identity to an exemplary polypeptide
of the invention. The extent of sequence identity (homology) may be determined using any
computer program and associated parameters, including those described herein, such as
BLAST 2.2.2, or FASTA version 3.0t78, with the default parameters.
In alternative embodiments, the sequence identity can be over a region of at
least about 5, 10, 20, 30, 40, 50, 100, 150, 200, 250, 300, 350, 400 consecutive residues, or
the full length of the nucleic acid or polypeptide. The extent of sequence identity (homology)
may be determined using any computer program and associated parameters, including those
described herein, such as BLAST 2.2.2, or FASTA version 3.0t78, with the default
parameters.

Homologous sequences also include RNA sequences in which uridines replace
the thymines in the nucleic acid sequences. The homologous sequences may be obtained
using any of the procedures described herein or may result from the correction of a
sequencing error. It will be appreciated that the nucleic acid sequences as set forth herein can
be represented in the traditional single character format (see, e.g., Stryer, Lubert,
Biochemistry, 3rd Ed., W. H Freeman & Co., New York) or in any other format which records
the identity of the nucleotides in a sequence.

Various sequence comparison programs identified herein are used in this
aspect of the invention. Protein and/or nucleic acid sequence identities (homologies) may be
evaluated using any of the variety of sequence comparison algorithms and programs known
in the art. Such algorithms and programs include, but are not limited to, TBLASTN,
BLASTP, FASTA, TFASTA, and CLUSTALW (Pearson and Lipman, Proc. Natl. Acad. Sci.
USA 85(8):2444-2448, 1988; Altschul et al., J. Mol. Biol. 215(3):403-410, 1990; Thompson
et al., Nucleic Acids Res. 22(2):4673-4680, 1994; Higgins et al., Methods Enzymol. 266:383-
402, 1996; Altschul et al., Nature Genetics 3:266-272, 1993).

Homology or identity can be measured using sequence analysis software (e.g.,
Sequence Analysis Software Package of the Genetics Computer Group, University of
Wisconsin Biotechnology Center, 1710 University Avenue, Madison, WI 53705). Such
software matches similar sequences by assigning degrees of homology to various deletions,
substitutions and other modifications. The terms "homology" and "identity" in the context of
two or more nucleic acids or polypeptide sequences, refer to two or more sequences or
subsequences that are the same or have a specified percentage of amino acid residues or
nucleotides that are the same when compared and aligned for maximum correspondence over
a comparison window or designated region as measured using any number of sequence
comparison algorithms or by manual alignment and visual inspection. For sequence
comparison, one sequence can act as a reference sequence (an exemplary sequence SEQ ID
NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID
NO:7, SEQ ID NO:8, etc.) to which test sequences are compared. When using a sequence
comparison algorithm, test and reference sequences are entered into a computer, subsequence
coordinates are designated, if necessary, and sequence algorithm program parameters are
designated. Default program parameters can be used, or alternative parameters can be
designated. The sequence comparison algorithm then calculates the percent sequence
identities for the test sequences relative to the reference sequence, based on the program
parameters.
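
As a minimal sketch of the final step described above, the following Python function computes a percent identity for two sequences that have already been aligned, optionally restricted to designated subsequence coordinates. The function name `percent_identity`, the use of '-' for gaps, and the choice of denominator (all alignment columns in the region that are not gaps in both sequences) are assumptions for this example; the programs named herein may define the denominator differently.

```python
def percent_identity(reference, test, start=0, end=None):
    """Percent of identical positions between two pre-aligned sequences.

    Both sequences must have equal length, with '-' marking alignment gaps.
    `start` and `end` designate the region (alignment columns) to compare.
    """
    if len(reference) != len(test):
        raise ValueError("aligned sequences must have the same length")
    ref_region = reference[start:end].upper()
    test_region = test[start:end].upper()
    # Ignore columns in which both sequences carry a gap.
    columns = [(r, t) for r, t in zip(ref_region, test_region)
               if not (r == "-" and t == "-")]
    if not columns:
        raise ValueError("empty comparison region")
    # A gap never counts as a match.
    matches = sum(1 for r, t in columns if r == t and r != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned sequences differing at one of ten positions.
    print(percent_identity("ATGGCTAAAC", "ATGGCAAAAC"))
```

Because the result depends on the region and on how gaps are counted, a reported percent identity is only meaningful together with the program and parameters used to obtain it.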

A "comparison window", as used herein, includes reference to a segment of
any one of the number of contiguous residues. For example, in alternative aspects of the
invention, contiguous residues ranging anywhere from 20 to the full length of an exemplary
sequence of the invention, e.g., SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, etc., are compared to a
reference sequence of the same number of contiguous positions after the two sequences are
optimally aligned. If the reference sequence has the requisite sequence identity to an
exemplary sequence of the invention, e.g., 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%,
58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%,
74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more sequence identity to a
sequence of the invention, e.g., SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, etc., that sequence is within the
scope of the invention. In alternative embodiments, subsequences ranging from about 20 to
600, about 50 to 200, and about 100 to 150 are compared to a reference sequence of the same
number of contiguous positions after the two sequences are optimally aligned. Methods of
alignment of sequence for comparison are well-known in the art. Optimal alignment of
sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith
& Waterman, Adv. Appl. Math. 2:482, 1981, by the homology alignment algorithm of
Needleman & Wunsch, J. Mol. Biol. 48:443, 1970, by the search for similarity method of
Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444, 1988, by computerized
implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin
Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by
manual alignment and visual inspection. Other algorithms for determining homology or
identity include, for example, in addition to a BLAST program (Basic Local Alignment
Search Tool at the National Center for Biotechnology Information), ALIGN, AMAS (Analysis of
Multiply Aligned Sequences), AMPS (Protein Multiple Sequence Alignment), ASSET
(Aligned Segment Statistical Evaluation Tool), BANDS, BESTSCOR, BIOSCAN (Biological
Sequence Comparative Analysis Node), BLIMPS (BLocks IMProved Searcher), FASTA,
Intervals & Points, BMB, CLUSTAL V, CLUSTAL W, CONSENSUS, LCONSENSUS,
WCONSENSUS, Smith-Waterman algorithm, DARWIN, Las Vegas algorithm, FNAT
(Forced Nucleotide Alignment Tool), Framealign, Framesearch, DYNAMIC, FILTER, FSAP
(Fristensky Sequence Analysis Package), GAP (Global Alignment Program), GENAL,
GIBBS, GenQuest, ISSC (Sensitive Sequence Comparison), LALIGN (Local Sequence
Alignment), LCP (Local Content Program), MACAW (Multiple Alignment Construction &
Analysis Workbench), MAP (Multiple Alignment Program), MBLKP, MBLKN, PIMA
(Pattern-Induced Multi-sequence Alignment), SAGA (Sequence Alignment by Genetic
Algorithm) and WHAT-IF. Such alignment programs can also be used to screen genome
databases to identify polynucleotide sequences having substantially identical sequences. A
number of genome databases are available, for example, a substantial portion of the human
genome is available as part of the Human Genome Sequencing Project (Gibbs, 1995).
Several genomes have been sequenced, e.g., M. genitalium (Fraser et al., 1995), M.
jannaschii (Bult et al., 1996), H. influenzae (Fleischmann et al., 1995), E. coli (Blattner et al.,
1997), and yeast (S. cerevisiae) (Mewes et al., 1997), and D. melanogaster (Adams et al.,
2000). Significant progress has also been made in sequencing the genomes of model
organisms, such as mouse, C. elegans, and Arabidopsis sp. Databases containing genomic
information annotated with some functional information are maintained by different
organizations, and are accessible via the internet.
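
To make the local homology approach of Smith & Waterman concrete, the following Python sketch computes the best local alignment score between two sequences by dynamic programming. It is a simplified illustration only: the function name `smith_waterman_score`, the scoring values (match +2, mismatch -1, gap -2) and the linear gap penalty are assumptions for this example, and the packaged implementations named above use their own scoring schemes and typically affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b (linear gap penalty).

    Only two rows of the dynamic programming matrix are kept, so the
    aligned region itself is not recovered, only its score.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Hypothetical sequences sharing the local region "ACGT".
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

The floor at zero lets a high-scoring shared region be found regardless of the dissimilar sequence surrounding it, in contrast to the global alignment of Needleman & Wunsch, which scores the sequences end to end.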

BLAST, BLAST 2.0 and BLAST 2.2.2 algorithms are also used to practice the
invention. They are described, e.g., in Altschul (1997) Nuc. Acids Res. 25:3389-3402;
Altschul (1990) J. Mol. Biol. 215:403-410. Software for performing BLAST analyses is
publicly available through the National Center for Biotechnology Information. This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul (1990) supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, for
nucleotide sequences, the parameters M (reward score for a pair of matching residues; always
>0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a
scoring matrix is used to calculate the cumulative score.
Extension of the word hits in each direction is halted when: the cumulative alignment score
falls off by the quantity X from its maximum achieved value; the cumulative score goes to
zero or below, due to the accumulation of one or more negative-scoring residue alignments;
or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X
determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide
sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and
a comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of 50,
expectation (E) of 10, M=5, N=-4, and a comparison of both strands. The BLAST algorithm
also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin
& Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873). One measure of similarity provided
by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of
the probability by which a match between two nucleotide or amino acid sequences would
occur by chance. For example, a nucleic acid is considered similar to a reference sequence
if the smallest sum probability in a comparison of the test nucleic acid to the reference
nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably
less than about 0.001. In one aspect, protein and nucleic acid sequence homologies are
evaluated using the Basic Local Alignment Search Tool ("BLAST"). For example, five
specific BLAST programs can be used to perform the following tasks: (1) BLASTP and
BLAST3 compare an amino acid query sequence against a protein sequence database; (2)
BLASTN compares a nucleotide query sequence against a nucleotide sequence database; (3)
BLASTX compares the six-frame conceptual translation products of a query nucleotide
sequence (both strands) against a protein sequence database; (4) TBLASTN compares a
query protein sequence against a nucleotide sequence database translated in all six reading
frames (both strands); and (5) TBLASTX compares the six-frame translations of a
nucleotide query sequence against the six-frame translations of a nucleotide sequence
database. The BLAST programs identify homologous sequences by identifying similar
segments, which are referred to herein as "high-scoring segment pairs," between a query
amino or nucleic acid sequence and a test sequence which is preferably obtained from a
protein or nucleic acid sequence database. High-scoring segment pairs are preferably
identified (i.e., aligned) by means of a scoring matrix, many of which are known in the art.
Preferably, the scoring matrix used is the BLOSUM62 matrix (Gonnet et al., Science
256:1443-1445, 1992; Henikoff and Henikoff, Proteins 17:49-61, 1993). Less preferably, the
PAM or PAM250 matrices may also be used (see, e.g., Schwartz and Dayhoff, eds., 1978,
Matrices for Detecting Distance Relationships: Atlas of Protein Sequence and Structure,
Washington: National Biomedical Research Foundation).
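
The word-hit extension step described above can be illustrated with a minimal sketch. It is not the NCBI implementation: it extends a seed in one direction only, without gaps, for nucleotide sequences, and the function name, argument names and the value chosen for X are assumptions made for illustration. The reward M=5 and penalty N=-4 are the BLASTN defaults quoted in the text.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact seed word hit to the right, without gaps.

    Returns (best cumulative score, length of the best-scoring segment).
    `x_drop` plays the role of the quantity X; its value here is an assumption.
    """
    score = match * word_len          # the seed is assumed to match exactly
    best_score = score
    best_len = word_len
    i = q_start + word_len
    j = s_start + word_len
    # Halt at the end of either sequence.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        if score > best_score:
            best_score = score
            best_len = i - q_start + 1
        # Halt when the score falls off by X from its maximum,
        # or when it goes to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len


if __name__ == "__main__":
    print(extend_right("ACGTACGTACGTTTTT", "ACGTACGTACGTGGGG", 0, 0, 4))
```

A full search would run the same loop to the left of the seed as well and would use a scoring matrix in place of M and N for amino acid sequences.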

In one aspect of the invention, to determine if a nucleic acid has the requisite
sequence identity to be within the scope of the invention, the NCBI BLAST 2.2.2 program is
used, with default options to blastp. There are about 38 setting options in the BLAST 2.2.2
program. In this exemplary aspect of the invention, all default values are used except for the
default filtering setting (i.e., all parameters set to default except filtering, which is set to OFF);
in its place a "-F F" setting is used, which disables filtering. Use of default filtering often
results in Karlin-Altschul violations due to short length of sequence.

The default values used in this exemplary aspect of the invention include:
"Filter for low complexity: ON
> Word Size: 3
> Matrix: Blosum62
> Gap Costs: Existence: 11
> Extension: 1"

Other default settings are: filter for low complexity OFF, word size of 3 for
protein, BLOSUM62 matrix, existence penalty of -11 and a gap extension penalty of -1.

An exemplary NCBI BLAST 2.2.2 program setting is set forth in Example 1,
below. Note that the "-W" option defaults to 0. This means that, if not set, the word size
defaults to 3 for proteins and 11 for nucleotides.
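
As a hedged illustration of the settings discussed above, the following sketch assembles an argument list for the legacy NCBI `blastall` command line program with filtering disabled. It does not reproduce Example 1. The function name, the query file name and the database name are hypothetical, and the command is only built, not executed.

```python
def build_blastp_command(query_path, database, filtering=False):
    """Return an argument list for a legacy `blastall` blastp search.

    `query_path` and `database` are placeholders supplied by the caller.
    """
    return [
        "blastall",
        "-p", "blastp",                    # program: protein query vs. protein database
        "-d", database,                    # database to search
        "-i", query_path,                  # query sequence file
        "-F", "T" if filtering else "F",   # "-F F" disables low-complexity filtering
        "-W", "0",                         # 0 = use the default word size (3 for proteins)
        "-M", "BLOSUM62",                  # scoring matrix
        "-G", "11",                        # gap existence cost
        "-E", "1",                         # gap extension cost
    ]


if __name__ == "__main__":
    print(" ".join(build_blastp_command("query.fasta", "example_db")))
```

The resulting list can be passed to a process launcher if a legacy BLAST installation is available; later BLAST+ releases use different option names.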

Computer systems and computer program products

To determine and identify sequence identities, structural homologies, motifs
and the like in silico, the sequence of the invention can be stored, recorded, and manipulated
on any medium which can be read and accessed by a computer. Accordingly, the invention
provides computers, computer systems, computer readable mediums, computer program
products and the like having recorded or stored thereon the nucleic acid and polypeptide
sequences of the invention, e.g., an exemplary sequence of the invention, e.g., SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID
NO:8, etc. As used herein, the words "recorded" and "stored" refer to a process for storing
information on a computer medium. A skilled artisan can readily adopt any known methods
for recording information on a computer readable medium to generate manufactures
comprising one or more of the nucleic acid and/or polypeptide sequences of the invention.

Another aspect of the invention is a computer readable medium having
recorded thereon at least one nucleic acid and/or polypeptide sequence of the invention.
Computer readable media include magnetically readable media, optically readable media,
electronically readable media and magnetic/optical media. For example, the computer
readable media may be a hard disk, a floppy disk, a magnetic tape, CD-ROM, Digital
Versatile Disk (DVD), Random Access Memory (RAM), or Read Only Memory (ROM) as
well as other types of media known to those skilled in the art.

Aspects of the invention include systems (e.g., internet based systems),
particularly computer systems, which store and manipulate the sequences and sequence
information described herein. One example of a computer system 100 is illustrated in block
diagram form in Figure 1. As used herein, "a computer system" refers to the hardware
components, software components, and data storage components used to analyze a nucleotide
or polypeptide sequence of the invention. The computer system 100 can include a processor
for processing, accessing and manipulating the sequence data. The processor 105 can be any
well-known type of central processing unit, such as, for example, the Pentium III from Intel
Corporation, or similar processor from Sun, Motorola, Compaq, AMD or International
Business Machines. The computer system 100 is a general purpose system that comprises the
processor 105 and one or more internal data storage components 110 for storing data, and one
or more data retrieving devices for retrieving the data stored on the data storage components.
A skilled artisan can readily appreciate that any one of the currently available computer
systems is suitable.

In one aspect, the computer system 100 includes a processor 105 connected to
a bus which is connected to a main memory 115 (preferably implemented as RAM) and one
or more internal data storage devices 110, such as a hard drive and/or other computer
readable media having data recorded thereon. The computer system 100 can further include
one or more data retrieving devices 118 for reading the data stored on the internal data storage
devices 110.

The data retrieving device 118 may represent, for example, a floppy disk
drive, a compact disk drive, a magnetic tape drive, or a modem capable of connection to a
remote data storage system (e.g., via the internet) etc. In some embodiments, the internal
data storage device 110 is a removable computer readable medium such as a floppy disk, a
compact disk, a magnetic tape, etc. containing control logic and/or data recorded thereon.
The computer system 100 may advantageously include or be programmed by appropriate
software for reading the control logic and/or the data from the data storage component once
inserted in the data retrieving device.

The computer system 100 includes a display 120 which is used to display 
output to a computer user. It should also be noted that the computer system 100 can be linked 
to other computer systems 125a-c in a network or wide area network to provide centralized 
access to the computer system 100. Software for accessing and processing the nucleotide or 
amino acid sequences of the invention can reside in main memory 115 during execution. 

In some aspects, the computer system 100 may further comprise a sequence
comparison algorithm for comparing a nucleic acid sequence of the invention. The algorithm
and sequence(s) can be stored on a computer readable medium. A "sequence comparison
algorithm" refers to one or more programs which are implemented (locally or remotely) on
the computer system 100 to compare a nucleotide sequence with other nucleotide sequences
and/or compounds stored within a data storage means. For example, the sequence
comparison algorithm may compare the nucleotide sequences of an exemplary sequence, e.g.,
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6,
SEQ ID NO:7, SEQ ID NO:8, etc. stored on a computer readable medium to reference
sequences stored on a computer readable medium to identify homologies or structural motifs.

The parameters used with the above algorithms may be adapted depending on
the sequence length and degree of homology studied. In some aspects, the parameters may
be the default parameters used by the algorithms in the absence of instructions from the user.
Figure 2 is a flow diagram illustrating one aspect of a process 200 for comparing a new
nucleotide or protein sequence with a database of sequences in order to determine the
homology levels between the new sequence and the sequences in the database. The database
of sequences can be a private database stored within the computer system 100, or a public
database such as GENBANK that is available through the Internet. The process 200 begins at
a start state 201 and then moves to a state 202 wherein the new sequence to be compared is
stored to a memory in a computer system 100. As discussed above, the memory could be any
type of memory, including RAM or an internal storage device.

The process 200 then moves to a state 204 wherein a database of sequences is
opened for analysis and comparison. The process 200 then moves to a state 206 wherein the
first sequence stored in the database is read into a memory on the computer. A comparison is
then performed at a state 210 to determine if the first sequence is the same as the second
sequence. It is important to note that this step is not limited to performing an exact
comparison between the new sequence and the first sequence in the database. Methods for
comparing two nucleotide or protein sequences, even if they are not identical, are well
known to those of skill in the art. For example, gaps can be introduced into one
sequence in order to raise the homology level between the two tested sequences. The
parameters that control whether gaps or other features are introduced into a sequence during
comparison are normally entered by the user of the computer system.

Once a comparison of the two sequences has been performed at the state 210,
a determination is made at a decision state 212 whether the two sequences are the same. Of
course, the term "same" is not limited to sequences that are absolutely identical. Sequences
that are within the homology parameters entered by the user will be marked as "same" in the
process 200. If a determination is made that the two sequences are the same, the process 200
moves to a state 214 wherein the name of the sequence from the database is displayed to the
user. This state notifies the user that the sequence with the displayed name fulfills the
homology constraints that were entered. Once the name of the stored sequence is displayed
to the user, the process 200 moves to a decision state 218 wherein a determination is made
whether more sequences exist in the database. If no more sequences exist in the database,
then the process 200 terminates at an end state 220. However, if more sequences do exist in
the database, then the process 200 moves to a state 224 wherein a pointer is moved to the
next sequence in the database so that it can be compared to the new sequence. In this manner,
the new sequence is aligned and compared with every sequence in the database.

It should be noted that if a determination had been made at the decision state
212 that the sequences were not homologous, then the process 200 would move immediately
to the decision state 218 in order to determine if any other sequences were available in the
database for comparison. Accordingly, one aspect of the invention is a computer system
comprising a processor, a data storage device having stored thereon a nucleic acid sequence
of the invention and a sequence comparer for conducting the comparison. The sequence
comparer may indicate a homology level between the sequences compared or identify
structural motifs, or it may identify structural motifs in sequences which are compared to
these nucleic acid codes and polypeptide codes.
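
The database loop of process 200 can be sketched as follows. This is a simplified illustration, not the claimed system: the comparison is ungapped, the database is an in-memory dictionary, and the names `percent_identity`, `find_same` and `threshold` are assumptions standing in for the user-entered homology parameters.

```python
def percent_identity(a, b):
    """Percentage of aligned positions that match, relative to the longer sequence."""
    if not a and not b:
        return 100.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / max(len(a), len(b))


def find_same(new_seq, database, threshold=90.0):
    """Return names of database sequences considered "same" as new_seq."""
    hits = []
    for name, seq in database.items():      # state 224: advance to the next sequence
        # decision state 212: "same" means within the homology threshold
        if percent_identity(new_seq, seq) >= threshold:
            hits.append(name)               # state 214: report the sequence name
    return hits                             # end state 220: database exhausted


if __name__ == "__main__":
    db = {"a": "ACGTACGTAC", "b": "ACGTACGTAA", "c": "TTTTTTTTTT"}
    print(find_same("ACGTACGTAC", db))
```

A practical comparer would substitute a gapped alignment method for `percent_identity`, as the text notes.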

Figure 3 is a flow diagram illustrating one embodiment of a process 250 in a
computer for determining whether two sequences are homologous. The process 250 begins at
a start state 252 and then moves to a state 254 wherein a first sequence to be compared is
stored to a memory. The second sequence to be compared is then stored to a memory at a
state 256. The process 250 then moves to a state 260 wherein the first character in the first
sequence is read and then to a state 262 wherein the first character of the second sequence is
read. It should be understood that if the sequence is a nucleotide sequence, then the character
would normally be either A, T, C, G or U. If the sequence is a protein sequence, then it can
be a single letter amino acid code so that the first and second sequences can be easily
compared. A determination is then made at a decision state 264 whether the two characters
are the same. If they are the same, then the process 250 moves to a state 268 wherein the
next characters in the first and second sequences are read. A determination is then made
whether the next characters are the same. If they are, then the process 250 continues this loop
until two characters are not the same. If a determination is made that the next two characters
are not the same, the process 250 moves to a decision state 274 to determine whether there
are any more characters in either sequence to read. If there are not any more characters to read,
then the process 250 moves to a state 276 wherein the level of homology between the first
and second sequences is displayed to the user. The level of homology is determined by
calculating the proportion of characters between the sequences that were the same out of the
total number of characters in the first sequence. Thus, if every character in a first 100
nucleotide sequence aligned with every character in a second sequence, the homology level
would be 100%.
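
The character-by-character comparison of process 250 can be sketched as below. The function name is hypothetical, and the sketch assumes the two sequences are compared position by position from their first characters, with no gaps; state numbers in the comments refer to Figure 3 as described in the text.

```python
def homology_level(first, second):
    """Percentage of characters in `first` that match `second` at the same position."""
    if not first:
        raise ValueError("first sequence must not be empty")
    same = 0
    pos = 0                                           # states 260/262: first characters
    while pos < len(first) and pos < len(second):     # decision state 274: more to read?
        if first[pos] == second[pos]:                 # decision state 264: same character?
            same += 1
        pos += 1                                      # state 268: read next characters
    # state 276: matches as a proportion of the characters in the first sequence
    return 100.0 * same / len(first)


if __name__ == "__main__":
    print(homology_level("ACGT", "ACGA"))
```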

Alternatively, the computer program can compare a reference sequence to a
sequence of the invention to determine whether the sequences differ at one or more positions.
The program can record the length and identity of inserted, deleted or substituted nucleotides
or amino acid residues with respect to the sequence of either the reference or the invention.
The computer program may be a program which determines whether a reference sequence
contains a single nucleotide polymorphism (SNP) with respect to a sequence of the invention,
or whether a sequence of the invention comprises a SNP of a known sequence. Thus, in
some aspects, the computer program is a program which identifies SNPs. The method may
be implemented by the computer systems described above and the method illustrated in
Figure 3. The method can be performed by reading a sequence of the invention and the
reference sequences through the use of the computer program and identifying differences
with the computer program.
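
A minimal sketch of the substitution case is given below. It assumes the two sequences are already aligned and of equal length, so it reports single-position substitutions only; detecting insertions and deletions would require an alignment step that is not shown. The function name and the 1-based position convention are assumptions.

```python
def find_substitutions(reference, candidate):
    """List (position, reference_residue, candidate_residue) for each mismatch.

    Positions are 1-based. Both sequences must have the same length.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (i + 1, r, c)
        for i, (r, c) in enumerate(zip(reference, candidate))
        if r != c
    ]


if __name__ == "__main__":
    print(find_substitutions("ACGTAC", "ACTTAC"))
```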

In other aspects the computer based system comprises an identifier for
identifying features within a nucleic acid or polypeptide of the invention. An "identifier"
refers to one or more programs which identify certain features within a nucleic acid
sequence. For example, an identifier may comprise a program which identifies an open
reading frame (ORF) in a nucleic acid sequence. Figure 4 is a flow diagram illustrating one
aspect of an identifier process 300 for detecting the presence of a feature in a sequence. The
process 300 begins at a start state 302 and then moves to a state 304 wherein a first sequence
that is to be checked for features is stored to a memory 115 in the computer system 100. The
process 300 then moves to a state 306 wherein a database of sequence features is opened.
Such a database would include a list of each feature's attributes along with the name of the
feature. For example, a feature name could be "Initiation Codon" and the attribute would be
"ATG". Another example would be the feature name "TAATAA Box" and the feature
attribute would be "TAATAA". An example of such a database is produced by the University
of Wisconsin Genetics Computer Group. Alternatively, the features may be structural
polypeptide motifs such as alpha helices, beta sheets, or functional polypeptide motifs such as
enzymatic active sites, helix-turn-helix motifs or other motifs known to those skilled in the
art. Once the database of features is opened at the state 306, the process 300 moves to a state
308 wherein the first feature is read from the database. A comparison of the attribute of the
first feature with the first sequence is then made at a state 310. A determination is then made
at a decision state 316 whether the attribute of the feature was found in the first sequence. If
the attribute was found, then the process 300 moves to a state 318 wherein the name of the
found feature is displayed to the user. The process 300 then moves to a decision state 320
wherein a determination is made whether more features exist in the database. If no more
features do exist, then the process 300 terminates at an end state 324. However, if more
features do exist in the database, then the process 300 reads the next sequence feature at a
state 326 and loops back to the state 310 wherein the attribute of the next feature is compared
against the first sequence. If the feature attribute is not found in the first sequence at the
decision state 316, the process 300 moves directly to the decision state 320 in order to
determine if any more features exist in the database. Thus, in one aspect, the invention
provides a computer program that identifies open reading frames (ORFs).
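
The identifier loop of process 300 can be sketched as follows, using the two feature examples named in the text. The in-memory dictionary stands in for the feature database, and the names `FEATURES` and `identify_features` are assumptions; only the first occurrence of each attribute is reported.

```python
# Feature name -> attribute, taken from the two examples given in the text.
FEATURES = {
    "Initiation Codon": "ATG",
    "TAATAA Box": "TAATAA",
}


def identify_features(sequence, features=FEATURES):
    """Return (feature name, 0-based position) for each feature found in `sequence`."""
    found = []
    for name, attribute in features.items():     # states 308/326: read each feature
        position = sequence.find(attribute)      # state 310: compare attribute to sequence
        if position != -1:                       # decision state 316: attribute found?
            found.append((name, position))       # state 318: report the feature name
    return found                                 # end state 324: no more features


if __name__ == "__main__":
    print(identify_features("CCATGGTAATAACC"))
```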

A polypeptide or nucleic acid sequence of the invention may be stored and
manipulated in a variety of data processor programs in a variety of formats. For example, a
sequence can be stored as text in a word processing file, such as Microsoft WORD or
WORDPERFECT, or as an ASCII file in a variety of database programs familiar to those of
skill in the art, such as DB2, SYBASE, or ORACLE. In addition, many computer programs
and databases may be used as sequence comparison algorithms, identifiers, or sources of
reference nucleotide sequences or polypeptide sequences to be compared to a nucleic acid
sequence of the invention. The programs and databases used to practice the invention
include, but are not limited to: MacPattern (EMBL), DiscoveryBase (Molecular Applications
Group), GeneMine (Molecular Applications Group), Look (Molecular Applications Group),
MacLook (Molecular Applications Group), BLAST and BLAST2 (NCBI), BLASTN and
BLASTX (Altschul et al., J. Mol. Biol. 215: 403, 1990), FASTA (Pearson and Lipman, Proc.
Natl. Acad. Sci. USA, 85: 2444, 1988), FASTDB (Brutlag et al. Comp. App. Biosci.
6:237-245, 1990), Catalyst (Molecular Simulations Inc.), Catalyst/SHAPE (Molecular Simulations
Inc.), Cerius2.DBAccess (Molecular Simulations Inc.), HypoGen (Molecular Simulations
Inc.), Insight II (Molecular Simulations Inc.), Discover (Molecular Simulations Inc.),
CHARMm (Molecular Simulations Inc.), Felix (Molecular Simulations Inc.), DelPhi
(Molecular Simulations Inc.), QuanteMM (Molecular Simulations Inc.), Homology
(Molecular Simulations Inc.), Modeler (Molecular Simulations Inc.), ISIS (Molecular
Simulations Inc.), Quanta/Protein Design (Molecular Simulations Inc.), WebLab (Molecular
Simulations Inc.), WebLab Diversity Explorer (Molecular Simulations Inc.), Gene Explorer
(Molecular Simulations Inc.), SeqFold (Molecular Simulations Inc.), the MDL Available
Chemicals Directory database, the MDL Drug Data Report data base, the Comprehensive
Medicinal Chemistry database, Derwent's World Drug Index database, the
BioByteMasterFile database, the Genbank database, and the Genseqn database. Many other
programs and data bases would be apparent to one of skill in the art given the present
disclosure.

Motifs which may be detected using the above programs include sequences
encoding leucine zippers, helix-turn-helix motifs, glycosylation sites, ubiquitination sites,
alpha helices, and beta sheets, signal sequences encoding signal peptides which direct the
secretion of the encoded proteins, sequences implicated in transcription regulation such as
homeoboxes, acidic stretches, enzymatic active sites, substrate binding sites, and enzymatic
cleavage sites.

Hybridization of nucleic acids

The invention provides isolated or recombinant nucleic acids that hybridize
under stringent conditions to an exemplary sequence of the invention, e.g., a sequence as set
forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21,
SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID
NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65,
SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:73, SEQ ID NO:75, SEQ ID
NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87,
SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID NO:97, SEQ ID
NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, or a nucleic acid that encodes a
polypeptide comprising a sequence as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ
ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72,
SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID
NO:84, SEQ ID NO:86, SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID NO:94,
SEQ ID NO:96, SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ
ID NO:106. The stringent conditions can be highly stringent conditions, medium stringent
conditions, low stringent conditions, including the high and reduced stringency conditions
described herein. In alternative embodiments, nucleic acids of the invention as defined by
their ability to hybridize under stringent conditions can be between about five residues and
the full length of the molecule, e.g., an exemplary nucleic acid of the invention. For example,
they can be at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 55, 60, 65, 70, 75, 80, 90, 100, 150, 200,
250, 300, 350, 400 residues in length. Nucleic acids shorter than full length are also
included. These nucleic acids are useful as, e.g., hybridization probes, labeling probes, PCR
oligonucleotide probes, iRNA (single or double stranded), antisense or sequences encoding
antibody binding peptides (epitopes), motifs, active sites and the like.

In one aspect, nucleic acids of the invention are defined by their ability to
hybridize under high stringency comprising conditions of about 50% formamide at about 37°C
to 42°C. In one aspect, nucleic acids of the invention are defined by their ability to hybridize
under reduced stringency comprising conditions in about 35% to 25% formamide at about
30°C to 35°C. Alternatively, nucleic acids of the invention are defined by their ability to
hybridize under high stringency comprising conditions at 42°C in 50% formamide, 5X SSPE,
0.3% SDS, and a repetitive sequence blocking nucleic acid, such as cot-1 or salmon sperm
DNA (e.g., 200 n/ml sheared and denatured salmon sperm DNA). In one aspect, nucleic
acids of the invention are defined by their ability to hybridize under reduced stringency
conditions comprising 35% formamide at a reduced temperature of 35°C.

Following hybridization, the filter may be washed with 6X SSC, 0.5% SDS at
50°C. These conditions are considered to be "moderate" conditions above 25% formamide
and "low" conditions below 25% formamide. A specific example of "moderate"
hybridization conditions is when the above hybridization is conducted at 30% formamide. A
specific example of "low stringency" hybridization conditions is when the above
hybridization is conducted at 10% formamide.
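
The formamide cut-off stated above can be expressed as a small helper. This is only an illustrative sketch: the function name is hypothetical, and because the text does not say how exactly 25% formamide is classified, that case is returned as "unspecified" rather than guessed.

```python
def stringency_label(formamide_percent):
    """Classify the wash conditions above by formamide concentration (percent)."""
    if not 0 <= formamide_percent <= 100:
        raise ValueError("formamide concentration must be between 0 and 100 percent")
    if formamide_percent > 25:
        return "moderate"
    if formamide_percent < 25:
        return "low"
    return "unspecified"   # exactly 25% is not classified in the text


if __name__ == "__main__":
    for pct in (30, 10):
        print(pct, stringency_label(pct))
```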

The temperature range corresponding to a particular level of stringency can be
further narrowed by calculating the purine to pyrimidine ratio of the nucleic acid of interest
and adjusting the temperature accordingly. Nucleic acids of the invention are also defined by
their ability to hybridize under high, medium, and low stringency conditions as set forth in
Ausubel and Sambrook. Variations on the above ranges and conditions are well known in the
art. Hybridization conditions are discussed further, below.

Oligonucleotides probes and methods for using them

The invention also provides nucleic acid probes for identifying nucleic acids
encoding a polypeptide with a phospholipase activity. In one aspect, the probe comprises at
least 10 consecutive bases of a sequence as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ
ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27,
SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID
NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49,
SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID
NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID
NO:83, SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93,
SEQ ID NO:95, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID
NO:105. Alternatively, a probe of the invention can be at least about 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 50, 55, 60, 65, 70, 75, 80, 90,
100, 150, about 10 to 50, about 20 to 60, about 30 to 70, consecutive bases of a sequence as
set forth in a sequence of the invention. The probes identify a nucleic acid by binding or
hybridization. The probes can be used in arrays of the invention, see discussion below,
including, e.g., capillary arrays. The probes of the invention can also be used to isolate other
nucleic acids or polypeptides.
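
To illustrate what "at least 10 consecutive bases of a sequence" means computationally, the following sketch enumerates every window of a given length from a sequence. The function name is hypothetical, and the short input string is a made-up example rather than any sequence of the invention.

```python
def consecutive_base_probes(sequence, length=10):
    """Return every run of `length` consecutive bases in `sequence`, in order."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # A sequence shorter than `length` yields no candidate probes.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


if __name__ == "__main__":
    for probe in consecutive_base_probes("ACGTACGTACGT", 10):
        print(probe)
```

Selecting among such candidates (for example by melting temperature or specificity) is a separate step not shown here.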

The probes of the invention can be used to determine whether a biological
sample, such as a soil sample, contains an organism having a nucleic acid sequence of the
invention or an organism from which the nucleic acid was obtained. In such procedures, a
biological sample potentially harboring the organism from which the nucleic acid was
isolated is obtained and nucleic acids are obtained from the sample. The nucleic acids are
contacted with the probe under conditions which permit the probe to specifically hybridize to
any complementary sequences present in the sample. Where necessary, conditions which
permit the probe to specifically hybridize to complementary sequences may be determined by
placing the probe in contact with complementary sequences from samples known to contain
the complementary sequence, as well as control sequences which do not contain the
complementary sequence. Hybridization conditions, such as the salt concentration of the
hybridization buffer, the formamide concentration of the hybridization buffer, or the
hybridization temperature, may be varied to identify conditions which allow the probe to
hybridize specifically to complementary nucleic acids (see discussion on specific
hybridization conditions).

If the sample contains the organism from which the nucleic acid was isolated,
specific hybridization of the probe is then detected. Hybridization may be detected by
labeling the probe with a detectable agent such as a radioactive isotope, a fluorescent dye or
an enzyme capable of catalyzing the formation of a detectable product. Many methods for
using the labeled probes to detect the presence of complementary nucleic acids in a sample
are familiar to those skilled in the art. These include Southern Blots, Northern Blots, colony
hybridization procedures, and dot blots. Protocols for each of these procedures are provided
in Ausubel and Sambrook.

Alternatively, more than one probe (at least one of which is capable of
specifically hybridizing to any complementary sequences which are present in the nucleic
acid sample), may be used in an amplification reaction to determine whether the sample
contains an organism containing a nucleic acid sequence of the invention (e.g., an organism
from which the nucleic acid was isolated). In one aspect, the probes comprise
oligonucleotides. In one aspect, the amplification reaction may comprise a PCR reaction.
PCR protocols are described in Ausubel and Sambrook (see discussion on amplification
reactions). In such procedures, the nucleic acids in the sample are contacted with the probes,
the amplification reaction is performed, and any resulting amplification product is detected.
The amplification product may be detected by performing gel electrophoresis on the reaction
products and staining the gel with an intercalator such as ethidium bromide. Alternatively,
one or more of the probes may be labeled with a radioactive isotope and the presence of a
radioactive amplification product may be detected by autoradiography after gel
electrophoresis.

Probes derived from sequences near the 3' or 5' ends of a nucleic acid
sequence of the invention can also be used in chromosome walking procedures to identify
clones containing additional, e.g., genomic sequences. Such methods allow the isolation of
genes which encode additional proteins of interest from the host organism.

In one aspect, nucleic acid sequences of the invention are used as probes to
identify and isolate related nucleic acids. In some aspects, the so-identified related nucleic
acids may be cDNAs or genomic DNAs from organisms other than the one from which the
nucleic acid of the invention was first isolated. In such procedures, a nucleic acid sample is
contacted with the probe under conditions which permit the probe to specifically hybridize to
related sequences. Hybridization of the probe to nucleic acids from the related organism is
then detected using any of the methods described above.

In nucleic acid hybridization reactions, the conditions used to achieve a 
particular level of stringency will vary, depending on the nature of the nucleic acids being 
hybridized. For example, the length, degree of complementarity, nucleotide sequence 
composition (e.g., GC v. AT content), and nucleic acid type (e.g., RNA v. DNA) of the 
hybridizing regions of the nucleic acids can be considered in selecting hybridization 
conditions. An additional consideration is whether one of the nucleic acids is immobilized, 
for example, on a filter. Hybridization may be carried out under conditions of low 
stringency, moderate stringency or high stringency. As an example of nucleic acid 
hybridization, a polymer membrane containing immobilized denatured nucleic acids is first 
prehybridized for 30 minutes at 45°C in a solution consisting of 0.9 M NaCl, 50 mM 
NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5 mg/ml 
polyriboadenylic acid. Approximately 2 X 10^7 cpm (specific activity 4-9 X 10^8 cpm/µg) of 
end-labeled oligonucleotide probe are then added to the solution. After 12-16 hours of 
incubation, the membrane is washed for 30 minutes at room temperature (RT) in 1X SET 
(150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5% SDS, 
followed by a 30 minute wash in fresh 1X SET at Tm-10°C for the oligonucleotide probe. 
The membrane is then exposed to auto-radiographic film for detection of hybridization 
signals. 

By varying the stringency of the hybridization conditions used to identify 
nucleic acids, such as cDNAs or genomic DNAs, which hybridize to the detectable probe, 
nucleic acids having different levels of homology to the probe can be identified and isolated. 
Stringency may be varied by conducting the hybridization at varying temperatures below the 
melting temperatures of the probes. The melting temperature, Tm, is the temperature (under 
defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly 
complementary probe. Very stringent conditions are selected to be equal to or about 5°C 
lower than the Tm for a particular probe. The melting temperature of the probe may be 
calculated using the following exemplary formulas. For probes between 14 and 70 
nucleotides in length the melting temperature (Tm) is calculated using the formula: 
Tm=81.5+16.6(log [Na+])+0.41(fraction G+C)-(600/N) where N is the length of the probe. 
If the hybridization is carried out in a solution containing formamide, the melting temperature 
may be calculated using the equation: Tm=81.5+16.6(log [Na+])+0.41(fraction G+C)-(0.63% 
formamide)-(600/N) where N is the length of the probe. Prehybridization may be carried out 
in 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 100 µg denatured fragmented salmon sperm 
DNA or 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 100 µg denatured fragmented salmon 
sperm DNA, 50% formamide. Formulas for SSC and Denhardt's and other solutions are 
listed, e.g., in Sambrook. 
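
The two exemplary melting temperature formulas above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the described procedures. The function name `melting_temperature` and its parameters are hypothetical. The sketch assumes that log is the base-10 logarithm, that [Na+] is given in mol/L, and that the G+C term is entered as a percentage (0-100), as in the commonly cited form of this formula; the text above writes "fraction G+C", so that reading is an assumption.

```python
import math


def melting_temperature(probe, na_molar, formamide_percent=0.0):
    """Estimate Tm (degrees C) for a probe of 14 to 70 nucleotides.

    Assumptions (not stated in the text): base-10 logarithm, [Na+] in mol/L,
    and G+C content expressed as a percentage rather than a 0-1 fraction.
    """
    seq = probe.upper()
    n = len(seq)
    if not 14 <= n <= 70:
        raise ValueError("formula is given for probes of 14 to 70 nucleotides")
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    tm = 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 600.0 / n
    # Formamide variant: subtract 0.63 degrees C per percent formamide.
    return tm - 0.63 * formamide_percent


if __name__ == "__main__":
    probe = "ATGC" * 5  # hypothetical 20-mer with 50% G+C
    print(round(melting_temperature(probe, na_molar=1.0), 1))
    print(round(melting_temperature(probe, na_molar=1.0, formamide_percent=50), 1))
```

Under these assumptions, adding 50% formamide lowers the estimate by 31.5°C, which is why formamide-containing hybridizations are run at a much lower temperature than aqueous ones.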

Hybridization is conducted by adding the detectable probe to the 
prehybridization solutions listed above. Where the probe comprises double stranded DNA, it 
is denatured before addition to the hybridization solution. The filter is contacted with the 
hybridization solution for a sufficient period of time to allow the probe to hybridize to 
cDNAs or genomic DNAs containing sequences complementary thereto or homologous 
thereto. For probes over 200 nucleotides in length, the hybridization may be carried out at 
15-25°C below the Tm. For shorter probes, such as oligonucleotide probes, the hybridization 
may be conducted at 5-10°C below the Tm. In one aspect, hybridizations in 6X SSC are 
conducted at approximately 68°C. In one aspect, hybridizations in 50% formamide 
containing solutions are conducted at approximately 42°C. All of the foregoing 
hybridizations would be considered to be under conditions of high stringency. 
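
As a rough illustration of the temperature guidance above, the following Python sketch returns the hybridization temperature window for a given Tm and probe length. The function name `hybridization_window` is hypothetical. The text gives guidance for probes over 200 nucleotides and for "shorter probes, such as oligonucleotide probes"; treating every probe of 200 nucleotides or fewer as a shorter probe is an assumption of this sketch.

```python
def hybridization_window(tm_celsius, probe_length):
    """Return (lowest, highest) hybridization temperature in degrees C.

    Probes over 200 nt: 15-25 degrees C below Tm.
    Shorter probes (assumed here: 200 nt or fewer): 5-10 degrees C below Tm.
    """
    if probe_length <= 0:
        raise ValueError("probe length must be positive")
    if probe_length > 200:
        return (tm_celsius - 25.0, tm_celsius - 15.0)
    return (tm_celsius - 10.0, tm_celsius - 5.0)


if __name__ == "__main__":
    print(hybridization_window(72.0, 20))    # oligonucleotide probe
    print(hybridization_window(90.0, 300))   # long probe
```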

Following hybridization, the filter is washed to remove any non-specifically 
bound detectable probe. The stringency used to wash the filters can also be varied depending 
on the nature of the nucleic acids being hybridized, the length of the nucleic acids being 
hybridized, the degree of complementarity, the nucleotide sequence composition (e.g., GC v. 
AT content), and the nucleic acid type (e.g., RNA v. DNA). Examples of progressively 
higher stringency condition washes are as follows: 2X SSC, 0.1% SDS at room temperature 
for 15 minutes (low stringency); 0.1X SSC, 0.5% SDS at room temperature for 30 minutes to 
1 hour (moderate stringency); 0.1X SSC, 0.5% SDS for 15 to 30 minutes at between the 
hybridization temperature and 68°C (high stringency); and 0.15M NaCl for 15 minutes at 
72°C (very high stringency). A final low stringency wash can be conducted in 0.1X SSC at 
room temperature. The examples above are merely illustrative of one set of conditions that 
can be used to wash filters. One of skill in the art would know that there are numerous 
recipes for different stringency washes. 

Nucleic acids which have hybridized to the probe can be identified by 
autoradiography or other conventional techniques. The above procedure may be modified to 
identify nucleic acids having decreasing levels of homology to the probe sequence. For 
example, to obtain nucleic acids of decreasing homology to the detectable probe, less 
stringent conditions may be used. For example, the hybridization temperature may be 
decreased in increments of 5°C from 68°C to 42°C in a hybridization buffer having a Na+ 
concentration of approximately 1M. Following hybridization, the filter may be washed with 
2X SSC, 0.5% SDS at the temperature of hybridization. These conditions are considered to 
be "moderate" conditions above 50°C and "low" conditions below 50°C. An example of 
"moderate" hybridization conditions is when the above hybridization is conducted at 55°C. 
An example of "low stringency" hybridization conditions is when the above hybridization is 
conducted at 45°C. 

Alternatively, the hybridization may be carried out in buffers, such as 6X SSC, 
containing formamide at a temperature of 42°C. In this case, the concentration of formamide 
in the hybridization buffer may be reduced in 5% increments from 50% to 0% to identify 
clones having decreasing levels of homology to the probe. Following hybridization, the filter 
may be washed with 6X SSC, 0.5% SDS at 50°C. These conditions are considered to be 
"moderate" conditions above 25% formamide and "low" conditions below 25% formamide. 
A specific example of "moderate" hybridization conditions is when the above hybridization is 
conducted at 30% formamide. A specific example of "low stringency" hybridization 
conditions is when the above hybridization is conducted at 10% formamide. 
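
The "moderate" and "low" labels used in the two preceding paragraphs can be summarized in a short Python sketch. The function names are hypothetical. The text classifies conditions above and below 50°C (temperature series) and above and below 25% formamide (formamide series) but does not say how the exact threshold value is classified, so the sketch reports it as "boundary"; that choice is an assumption.

```python
def stringency_by_temperature(temp_celsius, threshold=50.0):
    """Label a hybridization temperature in the ~1M Na+ series described above."""
    if temp_celsius > threshold:
        return "moderate"
    if temp_celsius < threshold:
        return "low"
    return "boundary"  # not classified by the text (assumption)


def stringency_by_formamide(percent, threshold=25.0):
    """Label a formamide concentration in the 42 degrees C series described above."""
    if percent > threshold:
        return "moderate"
    if percent < threshold:
        return "low"
    return "boundary"  # not classified by the text (assumption)


def formamide_series(start=50, stop=0, step=5):
    """Formamide percentages reduced in 5% increments from 50% to 0%."""
    return list(range(start, stop - 1, -step))


if __name__ == "__main__":
    for pct in formamide_series():
        print(pct, stringency_by_formamide(pct))
```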

These probes and methods of the invention can be used to isolate nucleic acids 
having a sequence with at least about 99%, 98%, 97%, at least 95%, at least 90%, at least 
85%, at least 80%, at least 75%, at least 70%, at least 65%, at least 60%, at least 55%, or at 
least 50% homology to a nucleic acid sequence of the invention comprising at least about 10, 
15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 250, 300, 350, 400, or 500 consecutive bases 
thereof, and the sequences complementary thereto. Homology may be measured using an 
alignment algorithm, as discussed herein. For example, the homologous polynucleotides may 
have a coding sequence which is a naturally occurring allelic variant of one of the coding 
sequences described herein. Such allelic variants may have a substitution, deletion or 
addition of one or more nucleotides when compared to nucleic acids of the invention. 

Additionally, the probes and methods of the invention may be used to isolate 
nucleic acids which encode polypeptides having at least about 99%, at least 95%, at least 
90%, at least 85%, at least 80%, at least 75%, at least 70%, at least 65%, at least 60%, at least 
55%, or at least 50% sequence identity (homology) to a polypeptide of the invention 
comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids 
thereof as determined using a sequence alignment algorithm (e.g., such as the FASTA version 
3.0t78 algorithm with the default parameters, or a BLAST 2.2.2 program with exemplary 
settings as set forth herein). 
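
To make the percentage thresholds above concrete, the following Python sketch computes percent identity for two sequences that have already been aligned. It is a toy illustration and not a substitute for the FASTA or BLAST programs named above. The function names are hypothetical, and the choice to divide identical, non-gap columns by the full alignment length is an assumption, since alignment programs differ in how they report identity.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same non-gap residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for a, b in pairs if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(identity, thresholds=(99, 95, 90, 85, 80, 75, 70, 65, 60, 55, 50)):
    """Return the highest listed threshold satisfied by `identity`, or None."""
    for t in sorted(thresholds, reverse=True):
        if identity >= t:
            return t
    return None


if __name__ == "__main__":
    ident = percent_identity("ACGT-A", "ACCTTA")  # hypothetical aligned pair
    print(round(ident, 2), highest_threshold_met(ident))
```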

Inhibiting Expression of Phospholipases 

The invention further provides for nucleic acids complementary to (e.g., 
antisense sequences to) the nucleic acids of the invention, e.g., phospholipase-encoding 
nucleic acids. Antisense sequences are capable of inhibiting the transport, splicing or 
transcription of phospholipase-encoding genes. The inhibition can be effected through the 
targeting of genomic DNA or messenger RNA. The transcription or function of targeted 
nucleic acid can be inhibited, for example, by hybridization and/or cleavage. One 
particularly useful set of inhibitors provided by the present invention includes 
oligonucleotides which are able to bind either phospholipase gene or message, in either case 
preventing or inhibiting the production or function of phospholipase enzyme. The 
association can be through sequence-specific hybridization. Another useful class of inhibitors 
includes oligonucleotides which cause inactivation or cleavage of phospholipase message. 
The oligonucleotide can have enzyme activity which causes such cleavage, such as 
ribozymes. The oligonucleotide can be chemically modified or conjugated to an enzyme or 
composition capable of cleaving the complementary nucleic acid. One may screen a pool of 
many different such oligonucleotides for those with the desired activity. 

Inhibition of phospholipase expression can have a variety of industrial 
applications. For example, inhibition of phospholipase expression can slow or prevent 
spoilage. Spoilage can occur when lipids or polypeptides, e.g., structural lipids or 
polypeptides, are enzymatically degraded. This can lead to the deterioration, or rot, of fruits 
and vegetables. In one aspect, use of compositions of the invention that inhibit the 
expression and/or activity of phospholipase, e.g., antibodies, antisense oligonucleotides, 
ribozymes and RNAi, are used to slow or prevent spoilage. Thus, in one aspect, the invention 
provides methods and compositions comprising application onto a plant or plant product 
(e.g., a fruit, seed, root, leaf, etc.) antibodies, antisense oligonucleotides, ribozymes and 
RNAi of the invention to slow or prevent spoilage. These compositions also can be 
expressed by the plant (e.g., a transgenic plant) or another organism (e.g., a bacterium or 
other microorganism transformed with a phospholipase gene of the invention). 

The compositions of the invention for the inhibition of phospholipase 
expression (e.g., antisense, iRNA, ribozymes, antibodies) can be used as pharmaceutical 
compositions. 

Antisense Oligonucleotides 

The invention provides antisense oligonucleotides capable of binding 
phospholipase message which can inhibit phospholipase activity by targeting mRNA. 
Strategies for designing antisense oligonucleotides are well described in the scientific and 
patent literature, and the skilled artisan can design such phospholipase oligonucleotides using 
the novel reagents of the invention. For example, gene walking/RNA mapping protocols to 
screen for effective antisense oligonucleotides are well known in the art, see, e.g., Ho (2000) 
Methods Enzymol. 314:168-183, describing an RNA mapping assay, which is based on 
standard molecular techniques to provide an easy and reliable method for potent antisense 
sequence selection. See also Smith (2000) Eur. J. Pharm. Sci. 11:191-198. 
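
As a simple in silico counterpart to walking along a message to find candidate antisense sites, the Python sketch below tiles an mRNA sequence with fixed-length windows and reports the DNA oligonucleotide complementary to each window. It is illustrative only and is not the RNA mapping assay of the cited references. The function name, the window length and the step size are hypothetical choices.

```python
# Map each mRNA base to the complementary DNA base.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_candidates(mrna, length=20, step=5):
    """Return (start, oligo) pairs; each oligo is the 5'->3' DNA reverse
    complement of mrna[start:start + length]."""
    seq = mrna.upper().replace("T", "U")
    if set(seq) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U/T")
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    candidates = []
    for start in range(0, len(seq) - length + 1, step):
        window = seq[start:start + length]
        candidates.append((start, window.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]))
    return candidates


if __name__ == "__main__":
    message = "AUGGCCAAGCUUGGAUCCGAAUUC"  # hypothetical 24-nt message fragment
    for start, oligo in antisense_candidates(message, length=18, step=3):
        print(start, oligo)
```

Candidates produced this way would still need to be screened experimentally, since the sketch takes no account of target accessibility or binding affinity.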

Naturally occurring nucleic acids are used as antisense oligonucleotides. The 
antisense oligonucleotides can be of any length; for example, in alternative aspects, the 
antisense oligonucleotides are between about 5 to 100, about 10 to 80, about 15 to 60, about 
18 to 40. The optimal length can be determined by routine screening. The antisense 
oligonucleotides can be present at any concentration. The optimal concentration can be 
determined by routine screening. A wide variety of synthetic, non-naturally occurring 
nucleotide and nucleic acid analogues are known which can address this potential problem. 
For example, peptide nucleic acids (PNAs) containing non-ionic backbones, such as N-(2- 
aminoethyl) glycine units can be used. Antisense oligonucleotides having phosphorothioate 
linkages can also be used, as described in WO 97/03211; WO 96/39154; Mata (1997) Toxicol 
Appl Pharmacol 144:189-197; Antisense Therapeutics, ed. Agrawal (Humana Press, Totowa, 
N.J., 1996). Antisense oligonucleotides having synthetic DNA backbone analogues provided 
by the invention can also include phosphoro-dithioate, methylphosphonate, phosphoramidate, 
alkyl phosphotriester, sulfamate, 3'-thioacetal, methylene(methylimino), 3'-N-carbamate, and 
morpholino carbamate nucleic acids, as described above. 

Combinatorial chemistry methodology can be used to create vast numbers of 
oligonucleotides that can be rapidly screened for specific oligonucleotides that have 
appropriate binding affinities and specificities toward any target, such as the sense and 
antisense phospholipase sequences of the invention (see, e.g., Gold (1995) J. of Biol. Chem. 
270:13581-13584). 

Inhibitory Ribozymes 

The invention provides ribozymes capable of binding phospholipase 
message which can inhibit phospholipase enzyme activity by targeting mRNA. Strategies for 
designing ribozymes and selecting the phospholipase-specific antisense sequence for 
targeting are well described in the scientific and patent literature, and the skilled artisan can 
design such ribozymes using the novel reagents of the invention. Ribozymes act by binding 
to a target RNA through the target RNA binding portion of a ribozyme which is held in close 
proximity to an enzymatic portion of the RNA that cleaves the target RNA. Thus, the 
ribozyme recognizes and binds a target RNA through complementary base-pairing, and once 
bound to the correct site, acts enzymatically to cleave and inactivate the target RNA. 
Cleavage of a target RNA in such a manner will destroy its ability to direct synthesis of an 
encoded protein if the cleavage occurs in the coding sequence. After a ribozyme has bound 
and cleaved its RNA target, it is typically released from that RNA and so can bind and cleave 
new targets repeatedly. 

In some circumstances, the enzymatic nature of a ribozyme can be 
advantageous over other technologies, such as antisense technology (where a nucleic acid 
molecule simply binds to a nucleic acid target to block its transcription, translation or 
association with another molecule) as the effective concentration of ribozyme necessary to 
effect a therapeutic treatment can be lower than that of an antisense oligonucleotide. This 
potential advantage reflects the ability of the ribozyme to act enzymatically. Thus, a single 
ribozyme molecule is able to cleave many molecules of target RNA. In addition, a ribozyme 
is typically a highly specific inhibitor, with the specificity of inhibition depending not only on 
the base pairing mechanism of binding, but also on the mechanism by which the molecule 
inhibits the expression of the RNA to which it binds. That is, the inhibition is caused by 
cleavage of the RNA target and so specificity is defined as the ratio of the rate of cleavage of 
the targeted RNA over the rate of cleavage of non-targeted RNA. This cleavage mechanism 
is dependent upon factors additional to those involved in base pairing. Thus, the specificity 
of action of a ribozyme can be greater than that of antisense oligonucleotide binding the same 
RNA site. 

The enzymatic ribozyme RNA molecule can be formed in a hammerhead 
motif, but may also be formed in the motif of a hairpin, hepatitis delta virus, group I intron or 
RNaseP-like RNA (in association with an RNA guide sequence). Examples of such 
hammerhead motifs are described by Rossi (1992) Aids Research and Human Retroviruses 
8:183; hairpin motifs by Hampel (1989) Biochemistry 28:4929, and Hampel (1990) Nuc. 
Acids Res. 18:299; the hepatitis delta virus motif by Perrotta (1992) Biochemistry 31:16; the 
RNaseP motif by Guerrier-Takada (1983) Cell 35:849; and the group I intron by Cech U.S. 
Pat. No. 4,987,071. The recitation of these specific motifs is not intended to be limiting; 
those skilled in the art will recognize that an enzymatic RNA molecule of this invention has a 
specific substrate binding site complementary to one or more of the target gene RNA regions, 
and has nucleotide sequence within or surrounding that substrate binding site which imparts 
an RNA cleaving activity to the molecule. 

RNA interference (RNAi) 

In one aspect, the invention provides an RNA inhibitory molecule, a so-called 
"RNAi" molecule, comprising a phospholipase sequence of the invention. The RNAi 
molecule comprises a double-stranded RNA (dsRNA) molecule. The RNAi can inhibit 
expression of a phospholipase gene. In one aspect, the RNAi is about 15, 16, 17, 18, 19, 20, 
21, 22, 23, 24, 25 or more duplex nucleotides in length. While the invention is not limited by 
any particular mechanism of action, the RNAi can enter a cell and cause the degradation of a 
single-stranded RNA (ssRNA) of similar or identical sequences, including endogenous 
mRNAs. When a cell is exposed to double-stranded RNA (dsRNA), mRNA from the 
homologous gene is selectively degraded by a process called RNA interference (RNAi). A 
possible basic mechanism behind RNAi is the breaking of a double-stranded RNA (dsRNA) 
matching a specific gene sequence into short pieces called short interfering RNA, which 
trigger the degradation of mRNA that matches its sequence. In one aspect, the RNAi's of the 
invention are used in gene-silencing therapeutics, see, e.g., Shuey (2002) Drug Discov. Today 
7:1040-1046. In one aspect, the invention provides methods to selectively degrade RNA 
using the RNAi's of the invention. The process may be practiced in vitro, ex vivo or in vivo. 
In one aspect, the RNAi molecules of the invention can be used to generate a loss-of-function 
mutation in a cell, an organ or an animal. Methods for making and using RNAi molecules for 
selectively degrading RNA are well known in the art, see, e.g., U.S. Patent No. 6,506,559; 
6,511,824; 6,515,109; 6,489,127. 

Modification of Nucleic Acids 

The invention provides methods of generating variants of the nucleic acids of 
the invention, e.g., those encoding a phospholipase enzyme. These methods can be repeated 
or used in various combinations to generate phospholipase enzymes having an altered or 
different activity or an altered or different stability from that of a phospholipase encoded by 
the template nucleic acid. These methods also can be repeated or used in various 
combinations, e.g., to generate variations in gene/message expression, message translation or 
message stability. In another aspect, the genetic composition of a cell is altered by, e.g., 
modification of a homologous gene ex vivo, followed by its reinsertion into the cell. 

A nucleic acid of the invention can be altered by any means. For example, 
random or stochastic methods, or, non-stochastic, or "directed evolution," methods. 

Methods for random mutation of genes are well known in the art, see, e.g., 
U.S. Patent No. 5,830,696. For example, mutagens can be used to randomly mutate a gene. 
Mutagens include, e.g., ultraviolet light or gamma irradiation, or a chemical mutagen, e.g., 
mitomycin, nitrous acid, photoactivated psoralens, alone or in combination, to induce DNA 
breaks amenable to repair by recombination. Other chemical mutagens include, for example, 
sodium bisulfite, nitrous acid, hydroxylamine, hydrazine or formic acid. Other mutagens are 
analogues of nucleotide precursors, e.g., nitrosoguanidine, 5-bromouracil, 2-aminopurine, or 
acridine. These agents can be added to a PCR reaction in place of the nucleotide precursor 
thereby mutating the sequence. Intercalating agents such as proflavine, acriflavine, 
quinacrine and the like can also be used. 
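
The effect of randomly mutating a gene can be pictured with a small in silico model. The Python sketch below substitutes each base with a different base at a fixed per-base probability. It is a toy illustration under stated assumptions: it does not model the chemistry or mutational bias of any particular mutagen named above, and the function name and the uniform substitution model are hypothetical.

```python
import random

BASES = "ACGT"


def random_point_mutations(seq, rate, rng):
    """Return a copy of `seq` in which each base is replaced, with
    probability `rate`, by one of the three other bases chosen uniformly."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    mutated = []
    for base in seq.upper():
        if base not in BASES:
            raise ValueError("sequence must contain only A, C, G, T")
        if rng.random() < rate:
            mutated.append(rng.choice([b for b in BASES if b != base]))
        else:
            mutated.append(base)
    return "".join(mutated)


if __name__ == "__main__":
    rng = random.Random(0)          # seeded for repeatability
    template = "ATGGCCAAGCTTGGATCC"  # hypothetical template fragment
    library = [random_point_mutations(template, 0.05, rng) for _ in range(5)]
    for variant in library:
        print(variant)
```

A library generated this way would then be screened for variants with the desired property, mirroring the mutate-and-screen cycle described in this section.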

Any technique in molecular biology can be used, e.g., random PCR 
mutagenesis, see, e.g., Rice (1992) Proc. Natl. Acad. Sci. USA 89:5467-5471; or, 
combinatorial multiple cassette mutagenesis, see, e.g., Crameri (1995) Biotechniques 18:194- 
196. Alternatively, nucleic acids, e.g., genes, can be reassembled after random, or 
"stochastic," fragmentation, see, e.g., U.S. Patent Nos. 6,291,242; 6,287,862; 6,287,861; 
5,955,358; 5,830,721; 5,824,514; 5,811,238; 5,605,793. In alternative aspects, modifications, 
additions or deletions are introduced by error-prone PCR, shuffling, oligonucleotide-directed 
mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette 
mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis, site- 
specific mutagenesis, gene reassembly, gene site saturated mutagenesis (GSSM), synthetic 
ligation reassembly (SLR), recombination, recursive sequence recombination, 
phosphothioate-modified DNA mutagenesis, uracil-containing template mutagenesis, gapped 
duplex mutagenesis, point mismatch repair mutagenesis, repair-deficient host strain 
mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion mutagenesis, 
restriction-selection mutagenesis, restriction-purification mutagenesis, artificial gene 
synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation, and/or a 
combination of these and other methods. 

The following publications describe a variety of recursive recombination 
procedures and/or methods which can be incorporated into the methods of the invention: 
Stemmer (1999) "Molecular breeding of viruses for targeting and other clinical properties" 
Tumor Targeting 4:1-4; Ness (1999) Nature Biotechnology 17:893-896; Chang (1999) 
"Evolution of a cytokine using DNA family shuffling" Nature Biotechnology 17:793-797; 
Minshull (1999) "Protein evolution by molecular breeding" Current Opinion in Chemical 
Biology 3:284-290; Christians (1999) "Directed evolution of thymidine kinase for AZT 
phosphorylation using DNA family shuffling" Nature Biotechnology 17:259-264; Crameri 
(1998) "DNA shuffling of a family of genes from diverse species accelerates directed 
evolution" Nature 391:288-291; Crameri (1997) "Molecular evolution of an arsenate 
detoxification pathway by DNA shuffling," Nature Biotechnology 15:436-438; Zhang (1997) 
"Directed evolution of an effective fucosidase from a galactosidase by DNA shuffling and 
screening" Proc. Natl. Acad. Sci. USA 94:4504-4509; Patten et al. (1997) "Applications of 
DNA Shuffling to Pharmaceuticals and Vaccines" Current Opinion in Biotechnology 8:724- 
733; Crameri et al. (1996) "Construction and evolution of antibody-phage libraries by DNA 
shuffling" Nature Medicine 2:100-103; Crameri et al. (1996) "Improved green fluorescent 
protein by molecular evolution using DNA shuffling" Nature Biotechnology 14:315-319; 
Gates et al. (1996) "Affinity selective isolation of ligands from peptide libraries through 
display on a lac repressor 'headpiece dimer'" Journal of Molecular Biology 255:373-386; 
Stemmer (1996) "Sexual PCR and Assembly PCR" In: The Encyclopedia of Molecular 
Biology. VCH Publishers, New York, pp.447-457; Crameri and Stemmer (1995) 
"Combinatorial multiple cassette mutagenesis creates all the permutations of mutant and 
wildtype cassettes" BioTechniques 18:194-195; Stemmer et al. (1995) "Single-step assembly 
of a gene and entire plasmid from large numbers of oligodeoxyribonucleotides" Gene, 
164:49-53; Stemmer (1995) "The Evolution of Molecular Computation" Science 270:1510; 
Stemmer (1995) "Searching Sequence Space" Bio/Technology 13:549-553; Stemmer (1994) 
"Rapid evolution of a protein in vitro by DNA shuffling" Nature 370:389-391; and Stemmer 
(1994) "DNA shuffling by random fragmentation and reassembly: In vitro recombination for 
molecular evolution." Proc. Natl. Acad. Sci. USA 91:10747-10751. 

Mutational methods of generating diversity include, for example, site-directed 
mutagenesis (Ling et al. (1997) "Approaches to DNA mutagenesis: an overview" Anal 
Biochem. 254(2):157-178; Dale et al. (1996) "Oligonucleotide-directed random mutagenesis 
using the phosphorothioate method" Methods Mol. Biol. 57:369-374; Smith (1985) "In vitro 
mutagenesis" Ann. Rev. Genet. 19:423-462; Botstein & Shortle (1985) "Strategies and 
applications of in vitro mutagenesis" Science 229:1193-1201; Carter (1986) "Site-directed 
mutagenesis" Biochem. J. 237:1-7; and Kunkel (1987) "The efficiency of oligonucleotide 
directed mutagenesis" in Nucleic Acids & Molecular Biology (Eckstein, F. and Lilley, D. M. 
J. eds., Springer Verlag, Berlin)); mutagenesis using uracil containing templates (Kunkel 
(1985) "Rapid and efficient site-specific mutagenesis without phenotypic selection" Proc. 
Natl. Acad. Sci. USA 82:488-492; Kunkel et al. (1987) "Rapid and efficient site-specific 
mutagenesis without phenotypic selection" Methods in Enzymol. 154, 367-382; and Bass et 
al. (1988) "Mutant Trp repressors with new DNA-binding specificities" Science 242:240- 
245); oligonucleotide-directed mutagenesis (Methods in Enzymol. 100:468-500 (1983); 
Methods in Enzymol. 154:329-350 (1987); Zoller & Smith (1982) "Oligonucleotide-directed 
mutagenesis using M13-derived vectors: an efficient and general procedure for the 
production of point mutations in any DNA fragment" Nucleic Acids Res. 10:6487-6500; 
Zoller & Smith (1983) "Oligonucleotide-directed mutagenesis of DNA fragments cloned into 
M13 vectors" Methods in Enzymol. 100:468-500; and Zoller & Smith (1987) 
"Oligonucleotide-directed mutagenesis: a simple method using two oligonucleotide primers 
and a single-stranded DNA template" Methods in Enzymol. 154:329-350); phosphorothioate- 
modified DNA mutagenesis (Taylor et al. (1985) "The use of phosphorothioate-modified 
DNA in restriction enzyme reactions to prepare nicked DNA" Nucl. Acids Res. 13:8749- 
8764; Taylor et al. (1985) "The rapid generation of oligonucleotide-directed mutations at high 
frequency using phosphorothioate-modified DNA" Nucl. Acids Res. 13:8765-8787 (1985); 
Nakamaye (1986) "Inhibition of restriction endonuclease Nci I cleavage by phosphorothioate 
groups and its application to oligonucleotide-directed mutagenesis" Nucl. Acids Res. 14: 
9679-9698; Sayers et al. (1988) "5'-3' Exonucleases in phosphorothioate-based 
oligonucleotide-directed mutagenesis" Nucl. Acids Res. 16:791-802; and Sayers et al. (1988) 
"Strand specific cleavage of phosphorothioate-containing DNA by reaction with restriction 
endonucleases in the presence of ethidium bromide" Nucl. Acids Res. 16:803-814); 
mutagenesis using gapped duplex DNA (Kramer et al. (1984) "The gapped duplex DNA 
approach to oligonucleotide-directed mutation construction" Nucl. Acids Res. 12:9441-9456; 
Kramer & Fritz (1987) Methods in Enzymol. "Oligonucleotide-directed construction of 
mutations via gapped duplex DNA" 154:350-367; Kramer et al. (1988) "Improved enzymatic 
in vitro reactions in the gapped duplex DNA approach to oligonucleotide-directed 
construction of mutations" Nucl. Acids Res. 16:7207; and Fritz et al. (1988) 
"Oligonucleotide-directed construction of mutations: a gapped duplex DNA procedure 
without enzymatic reactions in vitro" Nucl. Acids Res. 16:6987-6999). 

Additional protocols used in the methods of the invention include point 
mismatch repair (Kramer (1984) "Point Mismatch Repair" Cell 38:879-887), mutagenesis 
using repair-deficient host strains (Carter et al. (1985) "Improved oligonucleotide site- 
directed mutagenesis using M13 vectors" Nucl. Acids Res. 13:4431-4443; and Carter (1987) 
"Improved oligonucleotide-directed mutagenesis using M13 vectors" Methods in Enzymol. 
154:382-403), deletion mutagenesis (Eghtedarzadeh (1986) "Use of oligonucleotides to 
generate large deletions" Nucl. Acids Res. 14:5115), restriction-selection and 
restriction-purification (Wells et al. (1986) "Importance of hydrogen-bond 
formation in stabilizing the transition state of subtilisin" Phil. Trans. R. Soc. Lond. A 317: 
415-423), mutagenesis by total gene synthesis (Nambiar et al. (1984) "Total synthesis and 
cloning of a gene coding for the ribonuclease S protein" Science 223:1299-1301; Sakmar 
and Khorana (1988) "Total synthesis and expression of a gene for the α-subunit of bovine rod 
outer segment guanine nucleotide-binding protein (transducin)" Nucl. Acids Res. 14:6361- 
6372; Wells et al. (1985) "Cassette mutagenesis: an efficient method for generation of 
multiple mutations at defined sites" Gene 34:315-323; and Grundstrom et al. (1985) 
"Oligonucleotide-directed mutagenesis by microscale 'shot-gun' gene synthesis" Nucl. Acids 
Res. 13:3305-3316), double-strand break repair (Mandecki (1986) 
"Oligonucleotide-directed double-strand break repair in plasmids of Escherichia coli: a 
method for site-specific mutagenesis" Proc. Natl. Acad. Sci. USA, 83:7177-7181; and Arnold 
(1993) "Protein engineering for unusual environments" Current Opinion in Biotechnology 
4:450-455). Additional 
details on many of the above methods can be found in Methods in Enzymology Volume 154, 
which also describes useful controls for trouble-shooting problems with various mutagenesis 
methods. 

See also U.S. Patent Nos. 5,605,793 to Stemmer (Feb. 25, 1997), "Methods for 
In Vitro Recombination;" U.S. Pat. No. 5,811,238 to Stemmer et al. (Sep. 22, 1998) 
"Methods for Generating Polynucleotides having Desired Characteristics by Iterative 
Selection and Recombination;" U.S. Pat. No. 5,830,721 to Stemmer et al. (Nov. 3, 1998), 
"DNA Mutagenesis by Random Fragmentation and Reassembly;" U.S. Pat. No. 5,834,252 to 
Stemmer, et al. (Nov. 10, 1998) "End-Complementary Polymerase Reaction;" U.S. Pat. No. 
5,837,458 to Minshull, et al. (Nov. 17, 1998), "Methods and Compositions for Cellular and 
Metabolic Engineering;" WO 95/22625, Stemmer and Crameri, "Mutagenesis by Random 
Fragmentation and Reassembly;" WO 96/33207 by Stemmer and Lipschutz "End 
Complementary Polymerase Chain Reaction;" WO 97/20078 by Stemmer and Crameri 
"Methods for Generating Polynucleotides having Desired Characteristics by Iterative 
Selection and Recombination;" WO 97/35966 by Minshull and Stemmer, "Methods and 
Compositions for Cellular and Metabolic Engineering;" WO 99/41402 by Punnonen et al. 
"Targeting of Genetic Vaccine Vectors;" WO 99/41383 by Punnonen et al. "Antigen Library 
Immunization;" WO 99/41369 by Punnonen et al. "Genetic Vaccine Vector Engineering;" 
WO 99/41368 by Punnonen et al. "Optimization of Immunomodulatory Properties of Genetic 
Vaccines;" EP 752008 by Stemmer and Crameri, "DNA Mutagenesis by Random 
Fragmentation and Reassembly;" EP 0932670 by Stemmer "Evolving Cellular DNA Uptake 
by Recursive Sequence Recombination;" WO 99/23107 by Stemmer et al., "Modification of 
Virus Tropism and Host Range by Viral Genome Shuffling;" WO 99/21979 by Apt et al., 
"Human Papillomavirus Vectors;" WO 98/31837 by del Cardayre et al. "Evolution of Whole 
Cells and Organisms by Recursive Sequence Recombination;" WO 98/27230 by Patten and 
Stemmer, "Methods and Compositions for Polypeptide Engineering;" WO 98/27230 by 
Stemmer et al., "Methods for Optimization of Gene Therapy by Recursive Sequence 
Shuffling and Selection," WO 00/00632, "Methods for Generating Highly Diverse Libraries," 
WO 00/09679, "Methods for Obtaining in Vitro Recombined Polynucleotide Sequence Banks 
and Resulting Sequences," WO 98/42832 by Arnold et al., "Recombination of Polynucleotide 
Sequences Using Random or Defined Primers," WO 99/29902 by Arnold et al., "Method for 
Creating Polynucleotide and Polypeptide Sequences," WO 98/41653 by Vind, "An in Vitro 
Method for Construction of a DNA Library," WO 98/41622 by Borchert et al., "Method for 
Constructing a Library Using DNA Shuffling," and WO 98/42727 by Pati and Zarling, 
"Sequence Alterations using Homologous Recombination." 

Certain U.S. applications provide additional details regarding various diversity 
generating methods, including "SHUFFLING OF CODON ALTERED GENES" by Patten et 
al., filed Sep. 28, 1999 (U.S. Ser. No. 09/407,800); "EVOLUTION OF WHOLE CELLS 
AND ORGANISMS BY RECURSIVE SEQUENCE RECOMBINATION" by del Cardayre 
et al., filed Jul. 15, 1998 (U.S. Ser. No. 09/166,188), and Jul. 15, 1999 (U.S. Ser. No. 
09/354,922); "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" 
by Crameri et al., filed Sep. 28, 1999 (U.S. Ser. No. 09/408,392), and 
"OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" by Crameri et 
al., filed Jan. 18, 2000 (PCT/US00/01203); "USE OF CODON-VARIED 
OLIGONUCLEOTIDE SYNTHESIS FOR SYNTHETIC SHUFFLING" by Welch et al., 
filed Sep. 28, 1999 (U.S. Ser. No. 09/408,393); "METHODS FOR MAKING CHARACTER 
STRINGS, POLYNUCLEOTIDES & POLYPEPTIDES HAVING DESIRED 
CHARACTERISTICS" by Selifonov et al., filed Jan. 18, 2000 (PCT/US00/01202) and, e.g., 
"METHODS FOR MAKING CHARACTER STRINGS, POLYNUCLEOTIDES & 
POLYPEPTIDES HAVING DESIRED CHARACTERISTICS" by Selifonov et al., filed Jul. 
18, 2000 (U.S. Ser. No. 09/618,579); "METHODS OF POPULATING DATA 
STRUCTURES FOR USE IN EVOLUTIONARY SIMULATIONS" by Selifonov and 
Stemmer, filed Jan. 18, 2000 (PCT/US00/01138); and "SINGLE-STRANDED NUCLEIC 
ACID TEMPLATE-MEDIATED RECOMBINATION AND NUCLEIC ACID FRAGMENT 
ISOLATION" by Affholter, filed Sep. 6, 2000 (U.S. Ser. No. 09/656,549). 

Non-stochastic, or "directed evolution," methods, including, e.g., saturation 
mutagenesis (GSSM), synthetic ligation reassembly (SLR), or a combination thereof, are used 
to modify the nucleic acids of the invention to generate phospholipases with new or altered 
properties (e.g., activity under highly acidic or alkaline conditions, high temperatures, and the 
like). Polypeptides encoded by the modified nucleic acids can be screened for an activity 
before testing for a phospholipase or other activity. Any testing modality or protocol can be 
used, e.g., using a capillary array platform. See, e.g., U.S. Patent Nos. 6,280,926; 5,939,250. 

Saturation mutagenesis, or GSSM 

In one aspect of the invention, non-stochastic gene modification, a "directed 
evolution process," is used to generate phospholipases with new or altered properties. 
Variations of this method have been termed "gene site-saturation mutagenesis," "site- 
saturation mutagenesis," "saturation mutagenesis" or simply "GSSM." It can be used in 
combination with other mutagenization processes. See, e.g., U.S. Patent Nos. 6,171,820; 
6,238,884. In one aspect, GSSM comprises providing a template polynucleotide and a 
plurality of oligonucleotides, wherein each oligonucleotide comprises a sequence 
homologous to the template polynucleotide, thereby targeting a specific sequence of the 
template polynucleotide, and a sequence that is a variant of the homologous gene; generating 
progeny polynucleotides comprising non-stochastic sequence variations by replicating the 
template polynucleotide with the oligonucleotides, thereby generating polynucleotides 
comprising homologous gene sequence variations. 

In one aspect, codon primers containing a degenerate N,N,G/T sequence are 
used to introduce point mutations into a polynucleotide, so as to generate a set of progeny 
polypeptides in which a full range of single amino acid substitutions is represented at each 
amino acid position, e.g., an amino acid residue in an enzyme active site or ligand binding 
site targeted to be modified. These oligonucleotides can comprise a contiguous first 
homologous sequence, a degenerate N,N,G/T sequence, and, optionally, a second 
homologous sequence. The downstream progeny translational products from the use of such 
oligonucleotides include all possible amino acid changes at each amino acid site along the 
polypeptide, because the degeneracy of the N,N,G/T sequence includes codons for all 20 
amino acids. In one aspect, one such degenerate oligonucleotide (comprised of, e.g., one 
degenerate N,N,G/T cassette) is used for subjecting each original codon in a parental 
polynucleotide template to a full range of codon substitutions. In another aspect, at least two 
degenerate cassettes are used - either in the same oligonucleotide or not, for subjecting at 
least two original codons in a parental polynucleotide template to a full range of codon 
substitutions. For example, more than one N,N,G/T sequence can be contained in one 
oligonucleotide to introduce amino acid mutations at more than one site. This plurality of 
N,N,G/T sequences can be directly contiguous, or separated by one or more additional 
nucleotide sequence(s). In another aspect, oligonucleotides serviceable for introducing 
additions and deletions can be used either alone or in combination with the codons containing 
an N,N,G/T sequence, to introduce any combination or permutation of amino acid additions, 
deletions, and/or substitutions. 
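
The oligonucleotide layout described above (a first homologous sequence, a degenerate N,N,G/T triplet, and an optional second homologous sequence) can be illustrated with a minimal sketch. The function name `nnk_oligos`, the `flank` parameter and the example template are hypothetical and are not taken from the specification; "K" is the standard IUPAC code for G or T, so "NNK" stands for N,N,G/T.

```python
def nnk_oligos(template: str, flank: int = 15) -> list[tuple[int, str]]:
    """Return (codon_index, oligo) pairs, one for every codon of the template.

    Each oligo is: up to `flank` homologous bases, the degenerate triplet NNK,
    then up to `flank` homologous bases. Flanks are shorter near the ends.
    """
    template = template.upper()
    if len(template) % 3 != 0:
        raise ValueError("template length must be a multiple of 3")
    oligos = []
    for start in range(0, len(template), 3):
        left = template[max(0, start - flank):start]      # first homologous sequence
        right = template[start + 3:start + 3 + flank]     # optional second homologous sequence
        oligos.append((start // 3, left + "NNK" + right))
    return oligos


# Illustrative four-codon template (hypothetical sequence).
oligos = nnk_oligos("ATGGCTAAAGGT", flank=3)
for codon_index, oligo in oligos:
    print(codon_index, oligo)
```

A real primer design would also consider melting temperature, strand and primer length; none of that is modeled here.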

In one aspect, simultaneous mutagenesis of two or more contiguous amino 
acid positions is done using an oligonucleotide that contains contiguous N,N,G/T triplets, i.e. 
a degenerate (N,N,G/T)n sequence. In another aspect, degenerate cassettes having less 
degeneracy than the N,N,G/T sequence are used. For example, it may be desirable in some 
instances to use (e.g. in an oligonucleotide) a degenerate triplet sequence comprised of only 
one N, where said N can be in the first, second or third position of the triplet. Any other bases 
including any combinations and permutations thereof can be used in the remaining two 
positions of the triplet. Alternatively, it may be desirable in some instances to use (e.g. in an 
oligo) a degenerate N,N,N triplet sequence. 

In one aspect, use of degenerate triplets (e.g., N,N,G/T triplets) allows for 
systematic and easy generation of a full range of possible natural amino acids (for a total of 
20 amino acids) into each and every amino acid position in a polypeptide (in alternative 
aspects, the methods also include generation of less than all possible substitutions per amino 
acid residue, or codon, position). For example, for a 100 amino acid polypeptide, 2000 
distinct species (i.e. 20 possible amino acids per position x 100 amino acid positions) can be 
generated. Through the use of an oligonucleotide or set of oligonucleotides containing a 
degenerate N,N,G/T triplet, 32 individual sequences can code for all 20 possible natural 
amino acids. Thus, in a reaction vessel in which a parental polynucleotide sequence is 
subjected to saturation mutagenesis using at least one such oligonucleotide, there are 
generated 32 distinct progeny polynucleotides encoding 20 distinct polypeptides. In contrast, 
the use of a non-degenerate oligonucleotide in site-directed mutagenesis leads to only one 
progeny polypeptide product per reaction vessel. Nondegenerate oligonucleotides can 
optionally be used in combination with degenerate primers disclosed; for example, 
nondegenerate oligonucleotides can be used to generate specific point mutations in a working 
polynucleotide. This provides one means to generate specific silent point mutations, point 
mutations leading to corresponding amino acid changes, and point mutations that cause the 
generation of stop codons and the corresponding expression of polypeptide fragments. 
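
The counts given above for an N,N,G/T triplet can be checked with a short enumeration. This is only an illustrative sketch: it assumes the standard genetic code, and the variable names are hypothetical. It also shows that one of the 32 codons (TAG) is a stop codon under the standard code, a detail the passage does not mention.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO[i]
    for i, codon in enumerate(product(BASES, repeat=3))
}

# N,N,G/T: any base in positions 1 and 2, only G or T in position 3.
nnk_codons = [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]

encoded = {CODON_TABLE[c] for c in nnk_codons}
amino_acids = sorted(encoded - {"*"})                      # "*" marks a stop codon
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

# Single-substitution species for a 100-residue polypeptide, counted as in the text.
species_for_100_residues = 20 * 100

print(len(nnk_codons), "codons;", len(amino_acids), "amino acids; stops:", stop_codons)
print(species_for_100_residues, "single-position species")
```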

In one aspect, each saturation mutagenesis reaction vessel contains 
polynucleotides encoding at least 20 progeny polypeptide (e.g., phospholipase) molecules 
such that all 20 natural amino acids are represented at the one specific amino acid position 
corresponding to the codon position mutagenized in the parental polynucleotide (other 
aspects use less than all 20 natural combinations). The 32-fold degenerate progeny 
polypeptides generated from each saturation mutagenesis reaction vessel can be subjected to 
clonal amplification (e.g. cloned into a suitable host, e.g., E. coli host, using, e.g., an 
expression vector) and subjected to expression screening. When an individual progeny 
polypeptide is identified by screening to display a favorable change in property (when 
compared to the parental polypeptide, such as increased phospholipase activity under alkaline 
or acidic conditions), it can be sequenced to identify the correspondingly favorable amino 
acid substitution contained therein. 

In one aspect, upon mutagenizing each and every amino acid position in a 
parental polypeptide using saturation mutagenesis as disclosed herein, favorable amino acid 
changes may be identified at more than one amino acid position. One or more new progeny 
molecules can be generated that contain a combination of all or part of these favorable amino 
acid substitutions. For example, if 2 specific favorable amino acid changes are identified in 
each of 3 amino acid positions in a polypeptide, the permutations include 3 possibilities at 
each position (no change from the original amino acid, and each of two favorable changes) 
and 3 positions. Thus, there are 3 x 3 x 3 or 27 total possibilities, including 7 that were 
previously examined - 6 single point mutations (i.e. 2 at each of three positions) and no 
change at any position. 
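
The arithmetic in this example can be reproduced by enumerating the combinations directly. The labels below (`wt`, `a1`, `a2`, and so on) are hypothetical placeholders for the original residue and the two favorable substitutions at each position.

```python
from itertools import product

WT = "wt"  # placeholder for "no change from the original amino acid"

# Three positions, each with the original residue and two favorable changes.
options = [
    [WT, "a1", "a2"],
    [WT, "b1", "b2"],
    [WT, "c1", "c2"],
]

combos = list(product(*options))


def n_changes(combo):
    """Number of positions that differ from the original residue."""
    return sum(choice != WT for choice in combo)


# Already covered by the single-site screen: the parent plus the 6 single mutants.
already_examined = [c for c in combos if n_changes(c) <= 1]
# Combinations that only arise when favorable changes are combined.
new_combinations = [c for c in combos if n_changes(c) >= 2]

print(len(combos), len(already_examined), len(new_combinations))
```

Under these assumptions, 20 of the 27 combinations are new, which is the set a combinatorial follow-up round would need to build and screen.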

In another aspect, site-saturation mutagenesis can be used together with 
another stochastic or non-stochastic means to vary sequence, e.g., synthetic ligation 
reassembly (see below), shuffling, chimerization, recombination and other mutagenizing 
processes and mutagenizing agents. This invention provides for the use of any mutagenizing 
process(es), including saturation mutagenesis, in an iterative manner. 

Synthetic Ligation Reassembly (SLR) 

The invention provides a non-stochastic gene modification system termed 
"synthetic ligation reassembly," or simply "SLR," a "directed evolution process," to generate 
phospholipases with new or altered properties. SLR is a method of ligating oligonucleotide 
fragments together non-stochastically. This method differs from stochastic oligonucleotide 
shuffling in that the nucleic acid building blocks are not shuffled, concatenated or chimerized 
randomly, but rather are assembled non-stochastically. See, e.g., U.S. Patent Application 
Serial No. (USSN) 09/332,835 entitled "Synthetic Ligation Reassembly in Directed 
Evolution" and filed on June 14, 1999 ("USSN 09/332,835"). In one aspect, SLR comprises 
the following steps: (a) providing a template polynucleotide, wherein the template 
polynucleotide comprises sequence encoding a homologous gene; (b) providing a plurality 
of building block polynucleotides, wherein the building block polynucleotides are designed to 
cross-over reassemble with the template polynucleotide at a predetermined sequence, and a 
building block polynucleotide comprises a sequence that is a variant of the homologous gene 
and a sequence homologous to the template polynucleotide flanking the variant sequence; (c) 
combining a building block polynucleotide with a template polynucleotide such that the 
building block polynucleotide cross-over reassembles with the template polynucleotide to 
generate polynucleotides comprising homologous gene sequence variations. 

SLR does not depend on the presence of high levels of homology between 
polynucleotides to be rearranged. Thus, this method can be used to non-stochastically 
generate libraries (or sets) of progeny molecules comprised of over 10^100 different chimeras. 
SLR can be used to generate libraries comprised of over 10^1000 different progeny chimeras. 
Thus, aspects of the present invention include non-stochastic methods of producing a set of 
finalized chimeric nucleic acid molecules having an overall assembly order that is chosen by 
design. This method includes the steps of generating by design a plurality of specific nucleic 
acid building blocks having serviceable mutually compatible ligatable ends, and assembling 
these nucleic acid building blocks, such that a designed overall assembly order is achieved. 

The mutually compatible ligatable ends of the nucleic acid building blocks to 
be assembled are considered to be "serviceable" for this type of ordered assembly if they 
enable the building blocks to be coupled in predetermined orders. Thus the overall assembly 
order in which the nucleic acid building blocks can be coupled is specified by the design of 
the ligatable ends. If more than one assembly step is to be used, then the overall assembly 
order in which the nucleic acid building blocks can be coupled is also specified by the 
sequential order of the assembly step(s). In one aspect, the annealed building pieces are 
treated with an enzyme, such as a ligase (e.g. T4 DNA ligase), to achieve covalent bonding of 
the building pieces. 

In one aspect, the design of the oligonucleotide building blocks is obtained by 
analyzing a set of progenitor nucleic acid sequence templates that serve as a basis for 
producing a progeny set of finalized chimeric polynucleotides. These parental 
oligonucleotide templates thus serve as a source of sequence information that aids in the 
design of the nucleic acid building blocks that are to be mutagenized, e.g., chimerized or 
shuffled. 

In one aspect of this method, the sequences of a plurality of parental nucleic 
acid templates are aligned in order to select one or more demarcation points. The 
demarcation points can be located at an area of homology, and are comprised of one or more 
nucleotides. These demarcation points are preferably shared by at least two of the progenitor 
templates. The demarcation points can thereby be used to delineate the boundaries of 
oligonucleotide building blocks to be generated in order to rearrange the parental 
polynucleotides. The demarcation points identified and selected in the progenitor molecules 
serve as potential chimerization points in the assembly of the final chimeric progeny 
molecules. A demarcation point can be an area of homology (comprised of at least one 
homologous nucleotide base) shared by at least two parental polynucleotide sequences. 
Alternatively, a demarcation point can be an area of homology that is shared by at least half 
of the parental polynucleotide sequences, or, it can be an area of homology that is shared by 
at least two thirds of the parental polynucleotide sequences. Even more preferably a 
serviceable demarcation point is an area of homology that is shared by at least three fourths 
of the parental polynucleotide sequences, or, it can be shared by almost all of the parental 
polynucleotide sequences. In one aspect, a demarcation point is an area of homology that is 
shared by all of the parental polynucleotide sequences. 
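
The selection of demarcation points from aligned parental templates can be sketched as a column scan over a pre-computed alignment. This is an illustration only: the function `demarcation_points`, the `min_fraction` parameter and the toy sequences are hypothetical, the input is assumed to be already aligned to equal length (with "-" for gaps), and single-nucleotide columns stand in for the "area of homology" described above.

```python
from collections import Counter


def demarcation_points(aligned: list[str], min_fraction: float = 1.0) -> list[int]:
    """Return alignment columns whose most common base is shared by at least
    two parents and by at least `min_fraction` of all parents."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    n_parents = len(aligned)
    points = []
    for col in range(len(aligned[0])):
        counts = Counter(s[col] for s in aligned if s[col] != "-")
        if not counts:
            continue  # column contains only gaps
        _, count = counts.most_common(1)[0]
        if count >= 2 and count / n_parents >= min_fraction:
            points.append(col)
    return points


# Toy alignment of three parental templates (hypothetical sequences).
parents = ["ATGCA", "ATGGA", "TTGCA"]
shared_by_all = demarcation_points(parents, min_fraction=1.0)
shared_by_half = demarcation_points(parents, min_fraction=0.5)
print(shared_by_all, shared_by_half)
```

Raising or lowering `min_fraction` corresponds to the alternatives listed above (at least two parents, at least half, two thirds, three fourths, or all of the parental sequences).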

In one aspect, a ligation reassembly process is performed exhaustively in order 
to generate an exhaustive library of progeny chimeric polynucleotides. In other words, all 
possible ordered combinations of the nucleic acid building blocks are represented in the set of 
finalized chimeric nucleic acid molecules. At the same time, in another embodiment, the 
assembly order (i.e. the order of assembly of each building block in the 5' to 3' sequence of 
each finalized chimeric nucleic acid) in each combination is by design (or non-stochastic) as 
described above. Because of the non-stochastic nature of this invention, the possibility of 
unwanted side products is greatly reduced. 
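
An exhaustive library with a fixed assembly order can be pictured as the Cartesian product of the building-block variants available at each position. The sketch below is a toy illustration; the function names and the short block sequences are hypothetical, and ligation chemistry is not modeled.

```python
from itertools import product
from math import prod


def exhaustive_library(blocks_per_position: list[list[str]]) -> list[str]:
    """All ordered combinations: one block chosen per position, joined 5' to 3'
    in the designed order of the positions."""
    return ["".join(choice) for choice in product(*blocks_per_position)]


def library_size(blocks_per_position: list[list[str]]) -> int:
    """Library size without enumerating it: product of variants per position."""
    return prod(len(variants) for variants in blocks_per_position)


# Three positions with 2, 1 and 3 alternative building blocks (toy sequences).
blocks = [["AAA", "AAT"], ["CCC"], ["GGG", "GGT", "GGA"]]
library = exhaustive_library(blocks)
print(library_size(blocks), library)
```

Because the size is a product over positions, it grows very quickly with the number of positions and variants, which is why the size is computed separately from the enumeration.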

In another aspect, the ligation reassembly method is performed systematically. 
For example, the method is performed in order to generate a systematically 
compartmentalized library of progeny molecules, with compartments that can be screened 
systematically, e.g. one by one. In other words this invention provides that, through the 
selective and judicious use of specific nucleic acid building blocks, coupled with the selective 
and judicious use of sequentially stepped assembly reactions, a design can be achieved where 
specific sets of progeny products are made in each of several reaction vessels. This allows a 
systematic examination and screening procedure to be performed. Thus, these methods allow 
a potentially very large number of progeny molecules to be examined systematically in 
smaller groups. Because of its ability to perform chimerizations in a manner that is highly 
flexible yet exhaustive and systematic as well, particularly when there is a low level of 
homology among the progenitor molecules, these methods provide for the generation of a 
library (or set) comprised of a large number of progeny molecules. Because of the non- 
stochastic nature of the instant ligation reassembly invention, the progeny molecules 
generated preferably comprise a library of finalized chimeric nucleic acid molecules having 
an overall assembly order that is chosen by design. The saturation mutagenesis and 
optimized directed evolution methods also can be used to generate different progeny 
molecular species. It is appreciated that the invention provides freedom of choice and control 
regarding the selection of demarcation points, the size and number of the nucleic acid 
building blocks, and the size and design of the couplings. It is appreciated, furthermore, that 
the requirement for intermolecular homology is highly relaxed for the operability of this 
invention. In fact, demarcation points can even be chosen in areas of little or no 
intermolecular homology. For example, because of codon wobble, i.e. the degeneracy of 
codons, nucleotide substitutions can be introduced into nucleic acid building blocks without 
altering the amino acid originally encoded in the corresponding progenitor template. 
Alternatively, a codon can be altered such that the coding for an original amino acid is 
altered. This invention provides that such substitutions can be introduced into the nucleic 
acid building block in order to increase the incidence of intermolecularly homologous 
demarcation points and thus to allow an increased number of couplings to be achieved among 
the building blocks, which in turn allows a greater number of progeny chimeric molecules to 
be generated. 

In another aspect, the synthetic nature of the step in which the building blocks 
are generated allows the design and introduction of nucleotides (e.g., one or more 
nucleotides, which may be, for example, codons or introns or regulatory sequences) that can 
later be optionally removed in an in vitro process (e.g. by mutagenesis) or in an in vivo 
process (e.g. by utilizing the gene splicing ability of a host organism). It is appreciated that in 
many instances the introduction of these nucleotides may also be desirable for many other 
reasons in addition to the potential benefit of creating a serviceable demarcation point. 

In one aspect, a nucleic acid building block is used to introduce an intron. 
Thus, functional introns are introduced into a man-made gene manufactured according to the 
methods described herein. The artificially introduced intron(s) can be functional in a host 
cell for gene splicing much in the way that naturally-occurring introns serve functionally in 
gene splicing. 

Optimized Directed Evolution System 

The invention provides a non-stochastic gene modification system termed 
"optimized directed evolution system" to generate phospholipases with new or altered 
properties. Optimized directed evolution is directed to the use of repeated cycles of reductive 
reassortment, recombination and selection that allow for the directed molecular evolution of 
nucleic acids through recombination. Optimized directed evolution allows generation of a 
large population of evolved chimeric sequences, wherein the generated population is 
significantly enriched for sequences that have a predetermined number of crossover events. 

A crossover event is a point in a chimeric sequence where a shift in sequence 
occurs from one parental variant to another parental variant. Such a point is normally at the 
juncture of where oligonucleotides from two parents are ligated together to form a single 
sequence. This method allows calculation of the correct concentrations of oligonucleotide 
sequences so that the final chimeric population of sequences is enriched for the chosen 
number of crossover events. This provides more control over choosing chimeric variants 
having a predetermined number of crossover events. 

In addition, this method provides a convenient means for exploring a 
tremendous amount of the possible protein variant space in comparison to other systems. 
Previously, if one generated, for example, 10^13 chimeric molecules during a reaction, it would 
be extremely difficult to test such a high number of chimeric variants for a particular activity. 
Moreover, a significant portion of the progeny population would have a very high number of 
crossover events which resulted in proteins that were less likely to have increased levels of a 
particular activity. By using these methods, the population of chimeric molecules can be 
enriched for those variants that have a particular number of crossover events. Thus, although 
one can still generate 10^13 chimeric molecules during a reaction, each of the molecules 
chosen for further analysis most likely has, for example, only three crossover events. 
Because the resulting progeny population can be skewed to have a predetermined number of 
crossover events, the boundaries on the functional variety between the chimeric molecules are 
reduced. This provides a more manageable number of variables when calculating which 
oligonucleotide from the original parental polynucleotides might be responsible for affecting 
a particular trait. 

One method for creating a chimeric progeny polynucleotide sequence is to 
create oligonucleotides corresponding to fragments or portions of each parental sequence. 
Each oligonucleotide preferably includes a unique region of overlap so that mixing the 
oligonucleotides together results in a new variant that has each oligonucleotide fragment 
assembled in the correct order. Additional information can also be found in USSN 
09/332,835. The number of oligonucleotides generated for each parental variant bears a 
relationship to the total number of resulting crossovers in the chimeric molecule that is 
ultimately created. For example, three parental nucleotide sequence variants might be 
provided to undergo a ligation reaction in order to find a chimeric variant having, for 
example, greater activity at high temperature. As one example, a set of 50 oligonucleotide 
sequences can be generated corresponding to each portion of each parental variant. 
Accordingly, during the ligation reassembly process there could be up to 50 crossover events 
within each of the chimeric sequences. The probability that each of the generated chimeric 
polynucleotides will contain oligonucleotides from each parental variant in alternating order 
is very low. If each oligonucleotide fragment is present in the ligation reaction in the same 
molar quantity it is likely that in some positions oligonucleotides from the same parental 
polynucleotide will ligate next to one another and thus not result in a crossover event. If the 
concentration of each oligonucleotide from each parent is kept constant during any ligation 
step in this example, there is a 1/3 chance (assuming 3 parents) that an oligonucleotide from 
the same parental variant will ligate within the chimeric sequence and produce no crossover. 
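
The equimolar case in this example can be made concrete with a short calculation. The sketch below rests on assumptions that are not stated in the text: each junction between adjacent fragments is treated as an independent event, the chance of no crossover at a junction is 1 divided by the number of parents, and the number of junctions is taken as the number of fragments minus one (49 here, slightly below the "up to 50" quoted above). Under those assumptions the number of crossovers follows a binomial distribution.

```python
from math import comb


def crossover_pdf(n_fragments: int, n_parents: int) -> list[float]:
    """Probability of k crossovers (k = 0..junctions) for equimolar parents,
    assuming independent junctions and P(no crossover) = 1 / n_parents."""
    junctions = n_fragments - 1
    p_cross = 1 - 1 / n_parents
    return [
        comb(junctions, k) * p_cross**k * (1 - p_cross) ** (junctions - k)
        for k in range(junctions + 1)
    ]


pdf = crossover_pdf(n_fragments=50, n_parents=3)
mean = sum(k * p for k, p in enumerate(pdf))
mode = max(range(len(pdf)), key=pdf.__getitem__)

print(f"mean crossovers: {mean:.2f}, most likely count: {mode}")
print(f"P(exactly 3 crossovers): {pdf[3]:.2e}")
```

With equimolar fragments the distribution is centered far above a small target such as three crossovers, which illustrates why the relative concentrations need to be adjusted to enrich for a chosen number of crossover events.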

Accordingly, a probability density function (PDF) can be determined to 
predict the population of crossover events that are likely to occur during each step in a 
ligation reaction given a set number of parental variants, a number of oligonucleotides 
corresponding to each variant, and the concentrations of each variant during each step in the 
ligation reaction. The statistics and mathematics behind determining the PDF are described 
below. By utilizing these methods, one can calculate such a probability density function, and 
thus enrich the chimeric progeny population for a predetermined number of crossover events 
resulting from a particular ligation reaction. Moreover, a target number of crossover events 
can be predetermined, and the system then programmed to calculate the starting quantities of 
each parental oligonucleotide during each step in the ligation reaction to result in a 
probability density function that centers on the predetermined number of crossover events. 
These methods are directed to the use of repeated cycles of reductive reassortment, 
recombination and selection that allow for the directed molecular evolution of a nucleic acid 
encoding a polypeptide through recombination. This system allows generation of a large 
population of evolved chimeric sequences, wherein the generated population is significantly 
enriched for sequences that have a predetermined number of crossover events. A crossover 
event is a point in a chimeric sequence where a shift in sequence occurs from one parental 
variant to another parental variant. Such a point is normally at the juncture of where 
oligonucleotides from two parents are ligated together to form a single sequence. The 
method allows calculation of the correct concentrations of oligonucleotide sequences so that 
the final chimeric population of sequences is enriched for the chosen number of crossover 
events. This provides more control over choosing chimeric variants having a predetermined 
number of crossover events. 

In addition, these methods provide a convenient means for exploring a 
tremendous amount of the possible protein variant space in comparison to other systems. By 
using the methods described herein, the population of chimeric molecules can be enriched 
for those variants that have a particular number of crossover events. Thus, although one can 
still generate 10^13 chimeric molecules during a reaction, each of the molecules chosen for 
further analysis most likely has, for example, only three crossover events. Because the 
resulting progeny population can be skewed to have a predetermined number of crossover 
events, the boundaries on the functional variety between the chimeric molecules are reduced. 
This provides a more manageable number of variables when calculating which 
oligonucleotide from the original parental polynucleotides might be responsible for affecting 
a particular trait. 

In one aspect, the method creates a chimeric progeny polynucleotide sequence 
by creating oligonucleotides corresponding to fragments or portions of each parental 
sequence. Each oligonucleotide preferably includes a unique region of overlap so that mixing 
the oligonucleotides together results in a new variant that has each oligonucleotide fragment 
assembled in the correct order. See also USSN 09/332,835. 

The number of oligonucleotides generated for each parental variant bears a 
relationship to the total number of resulting crossovers in the chimeric molecule that is 
ultimately created. For example, three parental nucleotide sequence variants might be 
provided to undergo a ligation reaction in order to find a chimeric variant having, for 
example, greater activity at high temperature. As one example, a set of 50 oligonucleotide 
sequences can be generated corresponding to each portion of each parental variant. 
Accordingly, during the ligation reassembly process there could be up to 50 crossover events 
within each of the chimeric sequences. The probability that each of the generated chimeric 
polynucleotides will contain oligonucleotides from each parental variant in alternating order 
is very low. If each oligonucleotide fragment is present in the ligation reaction in the same 
molar quantity it is likely that in some positions oligonucleotides from the same parental 
polynucleotide will ligate next to one another and thus not result in a crossover event. If the 
concentration of each oligonucleotide from each parent is kept constant during any ligation 
step in this example, there is a 1/3 chance (assuming 3 parents) that an oligonucleotide from 
the same parental variant will ligate within the chimeric sequence and produce no crossover. 

Accordingly, a probability density function (PDF) can be determined to 
predict the population of crossover events that are likely to occur during each step in a 
ligation reaction given a set number of parental variants, a number of oligonucleotides 
corresponding to each variant, and the concentrations of each variant during each step in the 
ligation reaction. The statistics and mathematics behind determining the PDF are described 
below. One can calculate such a probability density function, and thus enrich the chimeric 
progeny population for a predetermined number of crossover events resulting from a 
particular ligation reaction. Moreover, a target number of crossover events can be 
predetermined, and the system then programmed to calculate the starting quantities of each 
parental oligonucleotide during each step in the ligation reaction to result in a probability 
density function that centers on the predetermined number of crossover events. 

Determining Crossover Events 

Embodiments of the invention include a system and software that receive a 
desired crossover probability density function (PDF), the number of parent genes to be 
reassembled, and the number of fragments in the reassembly as inputs. The output of this 
program is a "fragment PDF" that can be used to determine a recipe for producing 
reassembled genes, and the estimated crossover PDF of those genes. The processing 
described herein is preferably performed in MATLAB® (The Mathworks, Natick, 
Massachusetts), a programming language and development environment for technical 
computing. 
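
The specification does not give the program itself, so the following is only a rough sketch of the kind of calculation such software might perform, written in Python rather than MATLAB. It is not the "fragment PDF" method. It assumes a simplified model in which, at every position, one parent is supplied at a molar fraction f and the remaining parents share the rest equally, parent choices at adjacent positions are independent, and the number of junctions is the number of fragments minus one. All function names are hypothetical. Given a target mean number of crossovers, it solves for the fraction f that centers the crossover distribution on that target.

```python
from math import sqrt


def same_parent_probability(f: float, n_parents: int) -> float:
    """P(two adjacent fragments come from the same parent) when one parent has
    molar fraction f and the others share 1 - f equally (assumed model)."""
    rest = (1 - f) / (n_parents - 1)
    return f * f + (n_parents - 1) * rest * rest


def dominant_fraction(target_crossovers: float, n_fragments: int, n_parents: int) -> float:
    """Molar fraction of the dominant parent that gives the target mean
    number of crossovers under the assumed model."""
    junctions = n_fragments - 1
    p_same = 1 - target_crossovers / junctions
    if p_same < 1 / n_parents:
        raise ValueError("target exceeds the mean reached by equimolar mixing")
    m = n_parents
    # Closed-form root of m*f**2 - 2*f + 1 - p_same*(m - 1) = 0 with f >= 1/m.
    return (1 + sqrt((m - 1) * (m * p_same - 1))) / m


f = dominant_fraction(target_crossovers=3, n_fragments=50, n_parents=3)
p_same = same_parent_probability(f, n_parents=3)
expected_crossovers = 49 * (1 - p_same)

print(f"dominant parent fraction: {f:.4f}")
print(f"expected crossovers: {expected_crossovers:.2f}")
```

In this toy model a target of three crossovers over 50 fragments requires one parent to make up roughly 97% of the oligonucleotides at each position. A real recipe would presumably vary the ratios by position and ligation step, which this sketch does not attempt.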

Iterative Processes 

In practicing the invention, these processes can be iteratively repeated. For 
example, a nucleic acid (or, the nucleic acid) responsible for an altered phospholipase 
phenotype is identified, re-isolated, again modified, and re-tested for activity. This process can be 
iteratively repeated until a desired phenotype is engineered. For example, an entire 
biochemical anabolic or catabolic pathway can be engineered into a cell, including 
phospholipase activity. 

Similarly, if it is determined that a particular oligonucleotide has no effect at 
all on the desired trait (e.g., a new phospholipase phenotype), it can be removed as a variable 
by synthesizing larger parental oligonucleotides that include the sequence to be removed. 
Since incorporating the sequence within a larger sequence prevents any crossover events, 
there will no longer be any variation of this sequence in the progeny polynucleotides. This 
iterative practice of determining which oligonucleotides are most related to the desired trait, 
and which are unrelated, allows more efficient exploration of all of the possible protein variants 
that might provide a particular trait or activity. 

In vivo shuffling 

In vivo shuffling of molecules is used in methods of the invention that provide
variants of polypeptides of the invention, e.g., antibodies, phospholipase enzymes, and the
like. In vivo shuffling can be performed utilizing the natural property of cells to recombine
multimers. While recombination in vivo has provided the major natural route to molecular
diversity, genetic recombination remains a relatively complex process that involves 1) the
recognition of homologies; 2) strand cleavage, strand invasion, and metabolic steps leading to
the production of recombinant chiasma; and finally 3) the resolution of chiasma into discrete
recombined molecules. The formation of the chiasma requires the recognition of homologous
sequences.

In one aspect, the invention provides a method for producing a hybrid
polynucleotide from at least a first polynucleotide and a second polynucleotide. The
invention can be used to produce a hybrid polynucleotide by introducing at least a first
polynucleotide and a second polynucleotide which share at least one region of partial
sequence homology into a suitable host cell. The regions of partial sequence homology
promote processes which result in sequence reorganization producing a hybrid
polynucleotide. The term "hybrid polynucleotide", as used herein, is any nucleotide sequence
which results from the method of the present invention and contains sequence from at least
two original polynucleotide sequences. Such hybrid polynucleotides can result from
intermolecular recombination events which promote sequence integration between DNA
molecules. In addition, such hybrid polynucleotides can result from intramolecular reductive
reassortment processes which utilize repeated sequences to alter a nucleotide sequence within
a DNA molecule.

Producing sequence variants 

The invention also provides methods of making sequence variants of the
nucleic acid and phospholipase sequences of the invention or isolating phospholipase
enzyme, e.g., phospholipase, sequence variants using the nucleic acids and polypeptides of
the invention. In one aspect, the invention provides for variants of a phospholipase gene of
the invention, which can be altered by any means, including, e.g., random or stochastic
methods, or, non-stochastic, or "directed evolution," methods, as described above.

The isolated variants may be naturally occurring. Variants can also be created
in vitro. Variants may be created using genetic engineering techniques such as site directed
mutagenesis, random chemical mutagenesis, Exonuclease III deletion procedures, and
standard cloning techniques. Alternatively, such variants, fragments, analogs, or derivatives
may be created using chemical synthesis or modification procedures. Other methods of
making variants are also familiar to those skilled in the art. These include procedures in
which nucleic acid sequences obtained from natural isolates are modified to generate nucleic
acids which encode polypeptides having characteristics which enhance their value in
industrial or laboratory applications. In such procedures, a large number of variant sequences
having one or more nucleotide differences with respect to the sequence obtained from the
natural isolate are generated and characterized. These nucleotide differences can result in
amino acid changes with respect to the polypeptides encoded by the nucleic acids from the
natural isolates.

For example, variants may be created using error prone PCR. In error prone
PCR, PCR is performed under conditions where the copying fidelity of the DNA polymerase
is low, such that a high rate of point mutations is obtained along the entire length of the PCR
product. Error prone PCR is described, e.g., in Leung, D.W., et al., Technique, 1:11-15,
1989, and Caldwell, R. C. & Joyce G.F., PCR Methods Applic., 2:28-33, 1992. Briefly, in
such procedures, nucleic acids to be mutagenized are mixed with PCR primers, reaction
buffer, MgCl2, MnCl2, Taq polymerase and an appropriate concentration of dNTPs for
achieving a high rate of point mutation along the entire length of the PCR product. For
example, the reaction may be performed using 20 fmoles of nucleic acid to be mutagenized,
30pmole of each PCR primer, a reaction buffer comprising 50mM KCl, 10mM Tris HCl (pH
8.3) and 0.01% gelatin, 7mM MgCl2, 0.5mM MnCl2, 5 units of Taq polymerase, 0.2mM
dGTP, 0.2mM dATP, 1mM dCTP, and 1mM dTTP. PCR may be performed for 30 cycles of
94° C for 1 min, 45° C for 1 min, and 72° C for 1 min. However, it will be appreciated that
these parameters may be varied as appropriate. The mutagenized nucleic acids are cloned
into an appropriate vector and the activities of the polypeptides encoded by the mutagenized
nucleic acids are evaluated.
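
The exemplary error prone PCR quantities above can be checked with a few lines of
arithmetic. The Python sketch below is an illustration only; the variable and function
names are hypothetical, and the only inputs are the figures quoted in the preceding
paragraph plus the Avogadro constant.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

# Quantities quoted in the exemplary reaction above.
template_fmol = 20            # nucleic acid to be mutagenized
primer_fmol = 30 * 1000       # 30 pmole of each primer, expressed in fmoles
cycles = 30
step_minutes = (1, 1, 1)      # 94 C, 45 C and 72 C steps, 1 min each
dntp_mM = {"dGTP": 0.2, "dATP": 0.2, "dCTP": 1.0, "dTTP": 1.0}


def molecules_from_fmol(fmol):
    """Convert an amount in femtomoles to a number of molecules."""
    return fmol * 1e-15 * AVOGADRO


# Molar excess of each primer over template.
primer_to_template = primer_fmol / template_fmol
# Time spent in the programmed steps, ignoring ramping between temperatures.
cycling_minutes = cycles * sum(step_minutes)
# Imbalance of the dNTP pool (highest over lowest concentration).
dntp_bias = max(dntp_mM.values()) / min(dntp_mM.values())

if __name__ == "__main__":
    print(f"template molecules: {molecules_from_fmol(template_fmol):.3e}")
    print(f"primer:template molar ratio: {primer_to_template:.0f}")
    print(f"programmed cycling time: {cycling_minutes} min")
    print(f"dNTP pool bias: {dntp_bias:.1f}-fold")
```

The calculation shows each primer in 1500-fold molar excess over template, 90 minutes of
programmed step time, and a fivefold excess of dCTP and dTTP over dGTP and dATP.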

Variants may also be created using oligonucleotide directed mutagenesis to
generate site-specific mutations in any cloned DNA of interest. Oligonucleotide mutagenesis
is described, e.g., in Reidhaar-Olson (1988) Science 241:53-57. Briefly, in such procedures a
plurality of double stranded oligonucleotides bearing one or more mutations to be introduced
into the cloned DNA are synthesized and inserted into the cloned DNA to be mutagenized.
Clones containing the mutagenized DNA are recovered and the activities of the polypeptides
they encode are assessed.

Another method for generating variants is assembly PCR. Assembly PCR
involves the assembly of a PCR product from a mixture of small DNA fragments. A large
number of different PCR reactions occur in parallel in the same vial, with the products of one
reaction priming the products of another reaction. Assembly PCR is described in, e.g., U.S.
Patent No. 5,965,408.

Still another method of generating variants is sexual PCR mutagenesis. In
sexual PCR mutagenesis, forced homologous recombination occurs between DNA molecules
of different but highly related DNA sequence in vitro, as a result of random fragmentation of
the DNA molecule based on sequence homology, followed by fixation of the crossover by
primer extension in a PCR reaction. Sexual PCR mutagenesis is described, e.g., in Stemmer
(1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. Briefly, in such procedures a plurality
of nucleic acids to be recombined are digested with DNase to generate fragments having an
average size of 50-200 nucleotides. Fragments of the desired average size are purified and
resuspended in a PCR mixture. PCR is conducted under conditions which facilitate
recombination between the nucleic acid fragments. For example, PCR may be performed by
resuspending the purified fragments at a concentration of 10-30ng/µl in a solution of 0.2mM
of each dNTP, 2.2mM MgCl2, 50mM KCl, 10mM Tris HCl, pH 9.0, and 0.1% Triton X-100.
2.5 units of Taq polymerase per 100µl of reaction mixture is added and PCR is performed
using the following regime: 94°C for 60 seconds, 94°C for 30 seconds, 50-55°C for 30
seconds, 72°C for 30 seconds (30-45 times) and 72°C for 5 minutes. However, it will be
appreciated that these parameters may be varied as appropriate. In some aspects,
oligonucleotides may be included in the PCR reactions. In other aspects, the Klenow
fragment of DNA polymerase I may be used in a first set of PCR reactions and Taq
polymerase may be used in a subsequent set of PCR reactions. Recombinant sequences are
isolated and the activities of the polypeptides they encode are assessed.

Variants may also be created by in vivo mutagenesis. In some embodiments,
random mutations in a sequence of interest are generated by propagating the sequence of
interest in a bacterial strain, such as an E. coli strain, which carries mutations in one or more
of the DNA repair pathways. Such "mutator" strains have a higher random mutation rate
than that of a wild-type parent. Propagating the DNA in one of these strains will eventually
generate random mutations within the DNA. Mutator strains suitable for use for in vivo
mutagenesis are described, e.g., in PCT Publication No. WO 91/16427.

Variants may also be generated using cassette mutagenesis. In cassette 
mutagenesis a small region of a double stranded DNA molecule is replaced with a synthetic 
oligonucleotide "cassette" that differs from the native sequence. The oligonucleotide often
contains completely and/or partially randomized native sequence. 

Recursive ensemble mutagenesis may also be used to generate variants.
Recursive ensemble mutagenesis is an algorithm for protein engineering (protein
mutagenesis) developed to produce diverse populations of phenotypically related mutants
whose members differ in amino acid sequence. This method uses a feedback mechanism to
control successive rounds of combinatorial cassette mutagenesis. Recursive ensemble
mutagenesis is described, e.g., in Arkin (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815.

In some embodiments, variants are created using exponential ensemble
mutagenesis. Exponential ensemble mutagenesis is a process for generating combinatorial
libraries with a high percentage of unique and functional mutants, wherein small groups of
residues are randomized in parallel to identify, at each altered position, amino acids which
lead to functional proteins. Exponential ensemble mutagenesis is described, e.g., in
Delegrave (1993) Biotechnology Res. 11:1548-1552. Random and site-directed mutagenesis
are described, e.g., in Arnold (1993) Current Opinion in Biotechnology 4:450-455.

In some embodiments, the variants are created using shuffling procedures
wherein portions of a plurality of nucleic acids which encode distinct polypeptides are fused
together to create chimeric nucleic acid sequences which encode chimeric polypeptides as
described in, e.g., U.S. Patent Nos. 5,965,408; 5,939,250.

The invention also provides variants of polypeptides of the invention
comprising sequences in which one or more of the amino acid residues (e.g., of an exemplary
polypeptide of the invention) are substituted with a conserved or non-conserved amino acid
residue (e.g., a conserved amino acid residue) and such substituted amino acid residue may or
may not be one encoded by the genetic code. Conservative substitutions are those that
substitute a given amino acid in a polypeptide by another amino acid of like characteristics.
Thus, polypeptides of the invention include those with conservative substitutions of
sequences of the invention, including but not limited to the following replacements:
replacements of an aliphatic amino acid such as Alanine, Valine, Leucine and Isoleucine with
another aliphatic amino acid; replacement of a Serine with a Threonine or vice versa;
replacement of an acidic residue such as Aspartic acid and Glutamic acid with another acidic
residue; replacement of a residue bearing an amide group, such as Asparagine and Glutamine,
with another residue bearing an amide group; exchange of a basic residue such as Lysine and
Arginine with another basic residue; and replacement of an aromatic residue such as
Phenylalanine, Tyrosine with another aromatic residue. Other variants are those in which one
or more of the amino acid residues of the polypeptides of the invention includes a substituent
group.
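
The conservative replacements listed above can be expressed as a small lookup. The Python
sketch below is illustrative only: it encodes just the groups named in the preceding
paragraph, uses one-letter amino acid codes as an assumed representation, and the names
CONSERVATIVE_GROUPS, is_conservative and conservative_positions are hypothetical.

```python
# Groups taken from the replacements listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = {
    "aliphatic": set("AVLI"),  # Alanine, Valine, Leucine, Isoleucine
    "hydroxyl": set("ST"),     # Serine, Threonine
    "acidic": set("DE"),       # Aspartic acid, Glutamic acid
    "amide": set("NQ"),        # Asparagine, Glutamine
    "basic": set("KR"),        # Lysine, Arginine
    "aromatic": set("FY"),     # Phenylalanine, Tyrosine
}


def is_conservative(original, replacement):
    """True if replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )


def conservative_positions(reference, variant):
    """Return 0-based positions where two equal-length, aligned sequences
    differ by a conservative substitution."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i for i, (a, b) in enumerate(zip(reference, variant))
        if is_conservative(a, b)
    ]


if __name__ == "__main__":
    print(is_conservative("A", "V"))
    print(is_conservative("D", "K"))
    print(conservative_positions("MASDK", "MVTDR"))
```

Because the text describes the list as non-limiting, other groupings could be added to the
table without changing the functions.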

Other variants within the scope of the invention are those in which the
polypeptide is associated with another compound, such as a compound to increase the
half-life of the polypeptide, for example, polyethylene glycol.

Additional variants within the scope of the invention are those in which
additional amino acids are fused to the polypeptide, such as a leader sequence, a secretory
sequence, a proprotein sequence or a sequence which facilitates purification, enrichment, or
stabilization of the polypeptide.

In some aspects, the variants, fragments, derivatives and analogs of the
polypeptides of the invention retain the same biological function or activity as the exemplary
polypeptides, e.g., a phospholipase activity, as described herein. In other aspects, the variant,
fragment, derivative, or analog includes a proprotein, such that the variant, fragment,
derivative, or analog can be activated by cleavage of the proprotein portion to produce an
active polypeptide.

Optimizing codons to achieve high levels of protein expression in host cells

The invention provides methods for modifying phospholipase-encoding
nucleic acids to modify codon usage. In one aspect, the invention provides methods for
modifying codons in a nucleic acid encoding a phospholipase to increase or decrease its
expression in a host cell. The invention also provides nucleic acids encoding a phospholipase
modified to increase its expression in a host cell, phospholipase enzymes so modified, and
methods of making the modified phospholipase enzymes. The method comprises identifying
a "non-preferred" or a "less preferred" codon in a phospholipase-encoding nucleic acid and
replacing one or more of these non-preferred or less preferred codons with a "preferred
codon" encoding the same amino acid as the replaced codon, such that at least one
non-preferred or less preferred codon in the nucleic acid is replaced by a preferred codon
encoding the same amino acid. A preferred codon is a codon over-represented in coding
sequences in genes in the host cell and a non-preferred or less preferred codon is a codon
under-represented in coding sequences in genes in the host cell.
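
The replacement of non-preferred codons by preferred codons can be sketched in a few lines
of Python. This is a minimal illustration under stated assumptions, not the method of any
reference cited herein: the standard genetic code is assumed, the host codon usage table is
supplied by the caller, and the usage values in the example are hypothetical placeholders
rather than measured frequencies. The names preferred_codons, optimize_codons and
translate are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq):
    """Translate a coding sequence (length divisible by 3) to one-letter codes."""
    return "".join(GENETIC_CODE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def preferred_codons(usage):
    """Map each amino acid to its most frequent codon in a host usage table."""
    best = {}
    for codon, freq in usage.items():
        aa = GENETIC_CODE[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    return best


def optimize_codons(seq, usage):
    """Replace each codon with the host-preferred codon for the same amino acid.

    Codons whose amino acid is absent from the usage table are left unchanged.
    """
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    best = preferred_codons(usage)
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        out.append(best.get(GENETIC_CODE[codon], codon))
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical usage fractions for a host cell (placeholders only).
    usage = {"CTG": 0.50, "CTA": 0.04, "TTA": 0.13, "AAA": 0.76, "AAG": 0.24}
    original = "ATGCTAAAGTTA"
    optimized = optimize_codons(original, usage)
    print(optimized)
    print(translate(original) == translate(optimized))
```

Because every replacement is synonymous, the encoded amino acid sequence is unchanged; only
the codon choice moves toward codons over-represented in the chosen host.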

Host cells for expressing the nucleic acids, expression cassettes and vectors of
the invention include bacteria, yeast, fungi, plant cells, insect cells and mammalian cells.
Thus, the invention provides methods for optimizing codon usage in all of these cells, codon-
altered nucleic acids and polypeptides made by the codon-altered nucleic acids. Exemplary
host cells include gram negative bacteria, such as Escherichia coli and Pseudomonas
fluorescens; gram positive bacteria, such as Streptomyces diversa, Lactobacillus gasseri,
Lactococcus lactis, Lactococcus cremoris, Bacillus subtilis. Exemplary host cells also
include eukaryotic organisms, e.g., various yeast, such as Saccharomyces sp., including
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, and Kluyveromyces
lactis, Hansenula polymorpha, Aspergillus niger, and mammalian cells and cell lines and
insect cells and cell lines. Thus, the invention also includes nucleic acids and polypeptides
optimized for expression in these organisms and species.

For example, the codons of a nucleic acid encoding a phospholipase isolated
from a bacterial cell are modified such that the nucleic acid is optimally expressed in a
bacterial cell different from the bacteria from which the phospholipase was derived, a yeast, a
fungus, a plant cell, an insect cell or a mammalian cell. Methods for optimizing codons are
well known in the art, see, e.g., U.S. Patent No. 5,795,737; Baca (2000) Int. J. Parasitol.
30:113-118; Hale (1998) Protein Expr. Purif. 12:185-188; Narum (2001) Infect. Immun.
69:7250-7253. See also Narum (2001) Infect. Immun. 69:7250-7253, describing optimizing
codons in mouse systems; Outchkourov (2002) Protein Expr. Purif. 24:18-24, describing
optimizing codons in yeast; Feng (2000) Biochemistry 39:15399-15409, describing
optimizing codons in E. coli; Humphreys (2000) Protein Expr. Purif. 20:252-264, describing
optimizing codon usage that affects secretion in E. coli.

Transgenic non-human animals

The invention provides transgenic non-human animals comprising a nucleic
acid, a polypeptide, an expression cassette or vector or a transfected or transformed cell of the
invention. The transgenic non-human animals can be, e.g., goats, rabbits, sheep, pigs, cows,
rats and mice, comprising the nucleic acids of the invention. These animals can be used, e.g.,
as in vivo models to study phospholipase activity, or, as models to screen for modulators of
phospholipase activity in vivo. The coding sequences for the polypeptides to be expressed in
the transgenic non-human animals can be designed to be constitutive, or, under the control of
tissue-specific, developmental-specific or inducible transcriptional regulatory factors.
Transgenic non-human animals can be designed and generated using any method known in
the art; see, e.g., U.S. Patent Nos. 6,211,428; 6,187,992; 6,156,952; 6,118,044; 6,111,166;
6,107,541; 5,959,171; 5,922,854; 5,892,070; 5,880,327; 5,891,698; 5,639,940; 5,573,933;
5,387,742; 5,087,571, describing making and using transformed cells and eggs and transgenic
mice, rats, rabbits, sheep, pigs and cows. See also, e.g., Pollock (1999) J. Immunol. Methods
231:147-157, describing the production of recombinant proteins in the milk of transgenic
dairy animals; Baguisi (1999) Nat. Biotechnol. 17:456-461, demonstrating the production of
transgenic goats. U.S. Patent No. 6,211,428, describes making and using transgenic non-
human mammals which express in their brains a nucleic acid construct comprising a DNA
sequence. U.S. Patent No. 5,387,742, describes injecting cloned recombinant or synthetic
DNA sequences into fertilized mouse eggs, implanting the injected eggs in pseudo-pregnant
females, and growing to term transgenic mice whose cells express proteins related to the
pathology of Alzheimer's disease. U.S. Patent No. 6,187,992, describes making and using a
transgenic mouse whose genome comprises a disruption of the gene encoding amyloid
precursor protein (APP).

"Knockout animals" can also be used to practice the methods of the invention.
For example, in one aspect, the transgenic or modified animals of the invention comprise a
"knockout animal," e.g., a "knockout mouse," engineered not to express or to be unable to
express a phospholipase.

Transgenic Plants and Seeds

The invention provides transgenic plants and seeds comprising a nucleic acid,
a polypeptide (e.g., a phospholipase), an expression cassette or vector or a transfected or
transformed cell of the invention. The invention also provides plant products, e.g., oils,
seeds, leaves, extracts and the like, comprising a nucleic acid and/or a polypeptide (e.g., a
phospholipase) of the invention. The transgenic plant can be dicotyledonous (a dicot) or
monocotyledonous (a monocot). The invention also provides methods of making and using
these transgenic plants and seeds. The transgenic plant or plant cell expressing a polypeptide
of the invention may be constructed in accordance with any method known in the art. See,
for example, U.S. Patent No. 6,309,872.

Nucleic acids and expression constructs of the invention can be introduced
into a plant cell by any means. For example, nucleic acids or expression constructs can be
introduced into the genome of a desired plant host, or, the nucleic acids or expression
constructs can be episomes. Introduction into the genome of a desired plant can be such that
the host's phospholipase production is regulated by endogenous transcriptional or
translational control elements. The invention also provides "knockout plants" where
insertion of gene sequence by, e.g., homologous recombination, has disrupted the expression
of the endogenous gene. Means to generate "knockout" plants are well-known in the art, see,
e.g., Strepp (1998) Proc Natl. Acad. Sci. USA 95:4368-4373; Miao (1995) Plant J 7:359-365.
See discussion on transgenic plants, below.

The nucleic acids of the invention can be used to confer desired traits on
essentially any plant, e.g., on oil-seed containing plants, such as soybeans, rapeseed,
sunflower seeds, sesame and peanuts. Nucleic acids of the invention can be used to
manipulate metabolic pathways of a plant in order to optimize or alter the host's expression of
phospholipase. This can change phospholipase activity in a plant. Alternatively, a
phospholipase of the invention can be used in production of a transgenic plant to produce a
compound not naturally produced by that plant. This can lower production costs or create a
novel product.

In one aspect, the first step in production of a transgenic plant involves making
an expression construct for expression in a plant cell. These techniques are well known in the
art. They can include selecting and cloning a promoter, a coding sequence for facilitating
efficient binding of ribosomes to mRNA and selecting the appropriate gene terminator
sequences. One exemplary constitutive promoter is CaMV35S, from the cauliflower mosaic
virus, which generally results in a high degree of expression in plants. Other promoters are
more specific and respond to cues in the plant's internal or external environment. An
exemplary light-inducible promoter is the promoter from the cab gene, encoding the major
chlorophyll a/b binding protein.

In one aspect, the nucleic acid is modified to achieve greater expression in a
plant cell. For example, a sequence of the invention is likely to have a higher percentage of
A-T nucleotide pairs compared to that seen in a plant, some of which prefer G-C nucleotide
pairs. Therefore, A-T nucleotides in the coding sequence can be substituted with G-C
nucleotides without significantly changing the amino acid sequence to enhance production of
the gene product in plant cells.
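
One way to picture the substitution of A-T nucleotides with G-C nucleotides is to choose,
for each codon, a synonymous codon with the highest G-C content. The Python sketch below
is an illustration under stated assumptions and not a procedure taken from this
specification: it assumes the standard genetic code, makes only synonymous changes (so the
amino acid sequence is not changed at all), and breaks ties alphabetically. The names
gc_count, gc_enrich and gc_fraction are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Amino acid (or stop) -> list of all codons that encode it.
SYNONYMS = {}
for _codon, _aa in GENETIC_CODE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def translate(seq):
    """Translate a coding sequence (length divisible by 3) to one-letter codes."""
    return "".join(GENETIC_CODE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def gc_enrich(seq):
    """Swap each codon for a synonymous codon with maximal G-C content.

    A codon that already has the maximal G-C count is kept as is; otherwise
    the alphabetically first codon among the best candidates is used.
    """
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        candidates = SYNONYMS[GENETIC_CODE[codon]]
        best = max(gc_count(c) for c in candidates)
        if gc_count(codon) == best:
            out.append(codon)
        else:
            out.append(min(c for c in candidates if gc_count(c) == best))
    return "".join(out)


if __name__ == "__main__":
    original = "ATGAAATTAGAT"
    enriched = gc_enrich(original)
    print(enriched, gc_fraction(original), gc_fraction(enriched))
    print(translate(original) == translate(enriched))
```

In practice the choice among synonymous codons would also weigh the codon preferences of
the particular plant host, which this sketch ignores.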

A selectable marker gene can be added to the gene construct in order to identify
plant cells or tissues that have successfully integrated the transgene. This may be necessary
because achieving incorporation and expression of genes in plant cells is a rare event,
occurring in just a few percent of the targeted tissues or cells. Selectable marker genes
encode proteins that provide resistance to agents that are normally toxic to plants, such as
antibiotics or herbicides. Only plant cells that have integrated the selectable marker gene will
survive when grown on a medium containing the appropriate antibiotic or herbicide. As for
other inserted genes, marker genes also require promoter and termination sequences for
proper function.

In one aspect, making transgenic plants or seeds comprises incorporating
sequences of the invention and, optionally, marker genes into a target expression construct
(e.g., a plasmid), along with positioning of the promoter and the terminator sequences. This
can involve transferring the modified gene into the plant through a suitable method. For
example, a construct may be introduced directly into the genomic DNA of the plant cell using
techniques such as electroporation and microinjection of plant cell protoplasts, or the
constructs can be introduced directly to plant tissue using ballistic methods, such as DNA
particle bombardment. For example, see, e.g., Christou (1997) Plant Mol. Biol. 35:197-203;
Pawlowski (1996) Mol. Biotechnol. 6:17-30; Klein (1987) Nature 327:70-73; Takumi (1997)
Genes Genet. Syst. 72:63-69, discussing use of particle bombardment to introduce transgenes
into wheat; and Adam (1997) supra, for use of particle bombardment to introduce YACs into
plant cells. For example, Rinehart (1997) supra, used particle bombardment to generate
transgenic cotton plants. Apparatus for accelerating particles is described in U.S. Pat. No.
5,015,580; and, the commercially available BioRad (Biolistics) PDS-2000 particle
acceleration instrument; see also, John, U.S. Patent No. 5,608,148; and Ellis, U.S. Patent No.
5,681,730, describing particle-mediated transformation of gymnosperms.

In one aspect, protoplasts can be immobilized and injected with nucleic acids,
e.g., an expression construct. Although plant regeneration from protoplasts is not easy with
cereals, plant regeneration is possible in legumes using somatic embryogenesis from
protoplast derived callus. Organized tissues can be transformed with naked DNA using gene
gun technique, where DNA is coated on tungsten microprojectiles, shot 1/100th the size of
cells, which carry the DNA deep into cells and organelles. Transformed tissue is then induced
to regenerate, usually by somatic embryogenesis. This technique has been successful in
several cereal species including maize and rice.

Nucleic acids, e.g., expression constructs, can also be introduced into plant
cells using recombinant viruses. Plant cells can be transformed using viral vectors, such as,
e.g., tobacco mosaic virus derived vectors (Rouwendal (1997) Plant Mol. Biol. 33:989-999),
see Porta (1996) "Use of viral replicons for the expression of genes in plants," Mol.
Biotechnol. 5:209-221.

Alternatively, nucleic acids, e.g., an expression construct, can be combined
with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium
tumefaciens host vector. The virulence functions of the Agrobacterium tumefaciens host will
direct the insertion of the construct and adjacent marker into the plant cell DNA when the cell
is infected by the bacteria. Agrobacterium tumefaciens-mediated transformation techniques,
including disarming and use of binary vectors, are well described in the scientific literature.
See, e.g., Horsch (1984) Science 233:496-498; Fraley (1983) Proc. Natl. Acad. Sci. USA
80:4803 (1983); Gene Transfer to Plants, Potrykus, ed. (Springer-Verlag, Berlin 1995). The
DNA in an A. tumefaciens cell is contained in the bacterial chromosome as well as in another
structure known as a Ti (tumor-inducing) plasmid. The Ti plasmid contains a stretch of DNA
termed T-DNA (~20 kb long) that is transferred to the plant cell in the infection process and a
series of vir (virulence) genes that direct the infection process. A. tumefaciens can only infect
a plant through wounds: when a plant root or stem is wounded it gives off certain chemical
signals, in response to which the vir genes of A. tumefaciens become activated and direct a
series of events necessary for the transfer of the T-DNA from the Ti plasmid to the plant's
chromosome. The T-DNA then enters the plant cell through the wound. One speculation is
that the T-DNA waits until the plant DNA is being replicated or transcribed, then inserts itself
into the exposed plant DNA. In order to use A. tumefaciens as a transgene vector, the tumor-
inducing section of T-DNA has to be removed, while retaining the T-DNA border regions
and the vir genes. The transgene is then inserted between the T-DNA border regions, where
it is transferred to the plant cell and becomes integrated into the plant's chromosomes.

The invention provides for the transformation of monocotyledonous plants
using the nucleic acids of the invention, including important cereals, see Hiei (1997) Plant
Mol. Biol. 35:205-218. See also, e.g., Horsch, Science (1984) 233:496; Fraley (1983) Proc.
Natl. Acad. Sci USA 80:4803; Thykjaer (1997) supra; Park (1996) Plant Mol. Biol.
32:1135-1148, discussing T-DNA integration into genomic DNA. See also D'Halluin, U.S.
Patent No. 5,712,135, describing a process for the stable integration of a DNA comprising a
gene that is functional in a cell of a cereal, or other monocotyledonous plant.

In one aspect, the third step can involve selection and regeneration of whole
plants capable of transmitting the incorporated target gene to the next generation. Such
regeneration techniques rely on manipulation of certain phytohormones in a tissue culture
growth medium, typically relying on a biocide and/or herbicide marker that has been
introduced together with the desired nucleotide sequences. Plant regeneration from cultured
protoplasts is described in Evans et al., Protoplasts Isolation and Culture, Handbook of Plant
Cell Culture, pp. 124-176, MacMillan Publishing Company, New York, 1983; and Binding,
Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1985.
Regeneration can also be obtained from plant callus, explants, organs, or parts thereof. Such
regeneration techniques are described generally in Klee (1987) Ann. Rev. of Plant Phys.
38:467-486. To obtain whole plants from transgenic tissues such as immature embryos, they
can be grown under controlled environmental conditions in a series of media containing
nutrients and hormones, a process known as tissue culture. Once whole plants are generated
and produce seed, evaluation of the progeny begins.

After the expression cassette is stably incorporated in transgenic plants, it can
be introduced into other plants by sexual crossing. Any of a number of standard breeding
techniques can be used, depending upon the species to be crossed. Since transgenic
expression of the nucleic acids of the invention leads to phenotypic changes, plants
comprising the recombinant nucleic acids of the invention can be sexually crossed with a
second plant to obtain a final product. Thus, the seed of the invention can be derived from a
cross between two transgenic plants of the invention, or a cross between a plant of the
invention and another plant. The desired effects (e.g., expression of the polypeptides of the
invention to produce a plant in which flowering behavior is altered) can be enhanced when
both parental plants express the polypeptides (e.g., a phospholipase) of the invention. The
desired effects can be passed to future plant generations by standard propagation means.

The nucleic acids and polypeptides of the invention are expressed in or
inserted in any plant or seed. Transgenic plants of the invention can be dicotyledonous or
monocotyledonous. Examples of monocot transgenic plants of the invention are grasses,
such as meadow grass (blue grass, Poa), forage grass such as festuca, lolium, temperate
grass, such as Agrostis, and cereals, e.g., wheat, oats, rye, barley, rice, sorghum, and maize
(corn). Examples of dicot transgenic plants of the invention are tobacco, legumes, such as
lupins, potato, sugar beet, pea, bean and soybean, and cruciferous plants (family
Brassicaceae), such as cauliflower, rape seed, and the closely related model organism
Arabidopsis thaliana. Thus, the transgenic plants and seeds of the invention include a broad
range of plants, including, but not limited to, species from the genera Anacardium, Arachis,
Asparagus, Atropa, Avena, Brassica, Citrus, Citrullus, Capsicum, Carthamus, Cocos, Coffea,
Cucumis, Cucurbita, Daucus, Elaeis, Fragaria, Glycine, Gossypium, Helianthus,
Heterocallis, Hordeum, Hyoscyamus, Lactuca, Linum, Lolium, Lupinus, Lycopersicon,
Malus, Manihot, Majorana, Medicago, Nicotiana, Olea, Oryza, Panicum, Pannisetum,
Persea, Phaseolus, Pistachia, Pisum, Pyrus, Prunus, Raphanus, Ricinus, Secale, Senecio,
Sinapis, Solanum, Sorghum, Theobromus, Trigonella, Triticum, Vicia, Vitis, Vigna, and Zea.

In alternative embodiments, the nucleic acids of the invention are expressed in
plants (e.g., as transgenic plants), such as oil-seed containing plants, e.g., soybeans, rapeseed,
sunflower seeds, sesame and peanuts. The nucleic acids of the invention can be expressed in
plants which contain fiber cells, including, e.g., cotton, silk cotton tree (Kapok, Ceiba
pentandra), desert willow, creosote bush, winterfat, balsa, ramie, kenaf, hemp, roselle, jute,
sisal, abaca and flax. In alternative embodiments, the transgenic plants of the invention can
be members of the genus Gossypium, including members of any Gossypium species, such as
G. arboreum, G. herbaceum, G. barbadense, and G. hirsutum.

The invention also provides for transgenic plants to be used for producing
large amounts of the polypeptides (e.g., a phospholipase or antibody) of the invention. For
example, see Palmgren (1997) Trends Genet. 13:348; Chong (1997) Transgenic Res.
6:289-296 (producing human milk protein beta-casein in transgenic potato plants using an
auxin-inducible, bidirectional mannopine synthase (mas1',2') promoter with Agrobacterium
tumefaciens-mediated leaf disc transformation methods).

Using known procedures, one of skill can screen for plants of the invention by 
detecting the increase or decrease of transgene mRNA or protein in transgenic plants. Means 
for detecting and quantifying mRNAs or proteins are well known in the art.

Polypeptides and peptides

The invention provides isolated or recombinant polypeptides having a
sequence identity (e.g., at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%,
60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%,
76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence
identity) to an exemplary sequence of the invention, e.g., SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16,
SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID
NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38,
SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID
NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID
NO:72, SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82,
SEQ ID NO:84, SEQ ID NO:86, SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID
NO:94, SEQ ID NO:96, SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID
NO:104, SEQ ID NO:106. As discussed above, the identity can be over the full length of the
polypeptide, or the identity can be over a subsequence thereof, e.g., a region of at least about
50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700 or more
residues. Polypeptides of the invention can also be shorter than the full length of exemplary
polypeptides (e.g., SEQ ID NO:2; SEQ ID NO:4; SEQ ID NO:6; SEQ ID NO:8, etc.). In
alternative embodiments, the invention provides polypeptides (peptides, fragments) ranging in
size between about 5 residues and the full length of a polypeptide, e.g., an enzyme, such as a
phospholipase; exemplary sizes being of about 5, 10, 15, 20, 25, 30, 35,
40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 100, 125, 150, 175, 200, 250, 300, 350, 400 or more
residues, e.g., contiguous residues of the exemplary phospholipases of SEQ ID NO:2; SEQ
ID NO:4; SEQ ID NO:6; SEQ ID NO:8, etc. Peptides of the invention can be useful as, e.g.,
labeling probes, antigens, toleragens, motifs, phospholipase active sites.
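
By way of illustration only, the percent sequence identity referred to above can be computed from a pairwise alignment as sketched below. The sketch assumes the two sequences have already been aligned by some alignment program and that gaps are written as "-"; the function names, the choice of ungapped columns as the denominator, and the default thresholds are assumptions made for the example and are not a definition taken from this specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: the denominator is the number of ungapped columns; other
    conventions (e.g., alignment length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 50.0,
                             min_residues: int = 50) -> bool:
    """True if identity >= threshold over a region of at least min_residues."""
    compared = sum(1 for a, b in zip(aligned_a, aligned_b)
                   if a != "-" and b != "-")
    if compared < min_residues:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Two short hypothetical aligned fragments differing at one position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

A region shorter than `min_residues` is rejected regardless of its identity, mirroring the "region of at least about 50 ... residues" language above.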

In one aspect, the polypeptide has a phospholipase activity, e.g., cleavage of a
glycerolphosphate ester linkage, the ability to hydrolyze phosphate ester bonds, including
patatin, lipid acyl hydrolase (LAH), phospholipase A, B, C and/or phospholipase D activity.
In one aspect, exemplary polypeptides of the invention have a phospholipase activity as set
forth in Table 1, below:

Table 1
SEQ ID NO:     Enzyme type
103, 104       Patatin
11, 12         Patatin
13, 14         Patatin
17, 18         Patatin
25, 26         Patatin
27, 28         Patatin
33, 34         Patatin
35, 36         Patatin
43, 44         Patatin
45, 46         Patatin
55, 56         Patatin
59, 60         Patatin
65, 66         Patatin
71, 72         Patatin
77, 78         Patatin
85, 86         Patatin
87, 88         Patatin
91, 92         Patatin
95, 96         Patatin
99, 100        Patatin
1, 2           PLC
101, 102       PLC
105, 106       PLC
3, 4           PLC
31, 32         PLC
5, 6           PLC
7, 8           PLC
81, 82         PLC
89, 90         PLC
9, 10          PLC
93, 94         PLC
97, 98         PLC
15, 16         PLD
19, 20         PLD
21, 22         PLD
23, 24         PLD
29, 30         PLD
37, 38         PLD
39, 40         PLD
41, 42         PLD
47, 48         PLD
49, 50         PLD
51, 52         PLD
53, 54         PLD
57, 58         PLD
61, 62         PLD
63, 64         PLD
67, 68         PLD
71, 72         PLD
73, 74         PLD
75, 76         PLD
79, 80         PLD
83, 84         PLD

Polypeptides and peptides of the invention can be isolated from natural
sources, be synthetic, or be recombinantly generated polypeptides. Peptides and proteins can
be recombinantly expressed in vitro or in vivo. The peptides and polypeptides of the
invention can be made and isolated using any method known in the art. Polypeptides and
peptides of the invention can also be synthesized, whole or in part, using chemical methods
well known in the art. See, e.g., Caruthers (1980) Nucleic Acids Res. Symp. Ser. 215-223;
Horn (1980) Nucleic Acids Res. Symp. Ser. 225-232; Banga, A.K., Therapeutic Peptides and
Proteins, Formulation, Processing and Delivery Systems (1995) Technomic Publishing Co.,
Lancaster, PA. For example, peptide synthesis can be performed using various solid-phase
techniques (see, e.g., Roberge (1995) Science 269:202; Merrifield (1997) Methods Enzymol.
289:3-13) and automated synthesis may be achieved, e.g., using the ABI 431A Peptide
Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer.

The peptides and polypeptides of the invention can also be glycosylated. The
glycosylation can be added post-translationally either chemically or by cellular biosynthetic
mechanisms, wherein the latter incorporates the use of known glycosylation motifs, which can
be native to the sequence or can be added as a peptide or added in the nucleic acid coding
sequence. The glycosylation can be O-linked or N-linked.
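
As a non-limiting illustration of locating a known glycosylation motif in a sequence, the sketch below scans a polypeptide for the commonly cited N-linked consensus sequon Asn-X-Ser/Thr, where X is any residue other than proline. The choice of this particular motif, the function name and the example sequence are assumptions made for the example; the specification does not name a specific motif, and O-linked sites have no comparably simple consensus.

```python
import re

# Lookahead so that overlapping sequons (e.g., "NNTS") are all reported.
_N_GLYC_SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_n_glyc_sequons(sequence: str) -> list:
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [m.start() + 1 for m in _N_GLYC_SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical peptide: N-G-T and N-K-S are sequons, N-P-S is not.
    print(find_n_glyc_sequons("MNGTANPSNKSQ"))
```

A sequon found this way marks only a candidate site; whether it is actually glycosylated depends on the expression host and the protein context.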

The peptides and polypeptides of the invention, as defined above, include all
"mimetic" and "peptidomimetic" forms. The terms "mimetic" and "peptidomimetic" refer to
a synthetic chemical compound which has substantially the same structural and/or functional
characteristics of the polypeptides of the invention. The mimetic can be either entirely
composed of synthetic, non-natural analogues of amino acids, or is a chimeric molecule of
partly natural peptide amino acids and partly non-natural analogs of amino acids. The
mimetic can also incorporate any amount of natural amino acid conservative substitutions as
long as such substitutions also do not substantially alter the mimetic's structure and/or
activity. As with polypeptides of the invention which are conservative variants, routine
experimentation will determine whether a mimetic is within the scope of the invention, i.e.,
that its structure and/or function is not substantially altered. Thus, in one aspect, a mimetic
composition is within the scope of the invention if it has a phospholipase activity.

Polypeptide mimetic compositions of the invention can contain any
combination of non-natural structural components. In alternative aspects, mimetic
compositions of the invention include one or all of the following three structural groups: a)
residue linkage groups other than the natural amide bond ("peptide bond") linkages; b)
non-natural residues in place of naturally occurring amino acid residues; or c) residues which
induce secondary structural mimicry, i.e., to induce or stabilize a secondary structure, e.g., a
beta turn, gamma turn, beta sheet, alpha helix conformation, and the like. For example, a
polypeptide of the invention can be characterized as a mimetic when all or some of its
residues are joined by chemical means other than natural peptide bonds. Individual
peptidomimetic residues can be joined by peptide bonds, other chemical bonds or coupling
means, such as, e.g., glutaraldehyde, N-hydroxysuccinimide esters, bifunctional maleimides,
N,N'-dicyclohexylcarbodiimide (DCC) or N,N'-diisopropylcarbodiimide (DIC). Linking
groups that can be an alternative to the traditional amide bond ("peptide bond") linkages
include, e.g., ketomethylene (e.g., -C(=O)-CH2- for -C(=O)-NH-), aminomethylene
(CH2-NH), ethylene, olefin (CH=CH), ether (CH2-O), thioether (CH2-S), tetrazole (CN4-),
thiazole, retroamide, thioamide, or ester (see, e.g., Spatola (1983) in Chemistry and
Biochemistry of Amino Acids, Peptides and Proteins, Vol. 7, pp 267-357, "Peptide Backbone
Modifications," Marcel Dekker, NY).

A polypeptide of the invention can also be characterized as a mimetic by
containing all or some non-natural residues in place of naturally occurring amino acid
residues. Non-natural residues are well described in the scientific and patent literature; a few
exemplary non-natural compositions useful as mimetics of natural amino acid residues and
guidelines are described below. Mimetics of aromatic amino acids can be generated by
replacing by, e.g., D- or L-naphthylalanine; D- or L-phenylglycine; D- or L-2 thieneylalanine;
D- or L-1, -2, 3-, or 4-pyreneylalanine; D- or L-3 thieneylalanine; D- or L-(2-pyridinyl)-
alanine; D- or L-(3-pyridinyl)-alanine; D- or L-(2-pyrazinyl)-alanine; D- or L-(4-isopropyl)-
phenylglycine; D-(trifluoromethyl)-phenylglycine; D-(trifluoromethyl)-phenylalanine;
D-p-fluoro-phenylalanine; D- or L-p-biphenylphenylalanine; D- or L-p-methoxy-
biphenylphenylalanine; D- or L-2-indole(alkyl)alanines; and, D- or L-alkylalanines, where
alkyl can be substituted or unsubstituted methyl, ethyl, propyl, hexyl, butyl, pentyl, isopropyl,
iso-butyl, sec-isotyl, iso-pentyl, or a non-acidic amino acid. Aromatic rings of a non-natural
amino acid include, e.g., thiazolyl, thiophenyl, pyrazolyl, benzimidazolyl, naphthyl, furanyl,
pyrrolyl, and pyridyl aromatic rings.

Mimetics of acidic amino acids can be generated by substitution by, e.g.,
non-carboxylate amino acids while maintaining a negative charge; (phosphono)alanine; sulfated
threonine. Carboxyl side groups (e.g., aspartyl or glutamyl) can also be selectively modified
by reaction with carbodiimides (R'-N-C-N-R') such as, e.g., 1-cyclohexyl-3(2-morpholinyl-
(4-ethyl) carbodiimide or 1-ethyl-3(4-azonia-4,4-dimetholpentyl) carbodiimide. Aspartyl or
glutamyl can also be converted to asparaginyl and glutaminyl residues by reaction with
ammonium ions. Mimetics of basic amino acids can be generated by substitution with, e.g.,
(in addition to lysine and arginine) the amino acids ornithine, citrulline, or (guanidino)-acetic
acid, or (guanidino)alkyl-acetic acid, where alkyl is defined above. Nitrile derivative (e.g.,
containing the CN-moiety in place of COOH) can be substituted for asparagine or glutamine.
Asparaginyl and glutaminyl residues can be deaminated to the corresponding aspartyl or
glutamyl residues. Arginine residue mimetics can be generated by reacting arginyl with, e.g.,
one or more conventional reagents, including, e.g., phenylglyoxal, 2,3-butanedione,
1,2-cyclo-hexanedione, or ninhydrin, preferably under alkaline conditions. Tyrosine residue
mimetics can be generated by reacting tyrosyl with, e.g., aromatic diazonium compounds or
tetranitromethane. N-acetylimidizol and tetranitromethane can be used to form O-acetyl
tyrosyl species and 3-nitro derivatives, respectively. Cysteine residue mimetics can be
generated by reacting cysteinyl residues with, e.g., alpha-haloacetates such as 2-chloroacetic
acid or chloroacetamide and corresponding amines, to give carboxymethyl or
carboxyamidomethyl derivatives. Cysteine residue mimetics can also be generated by
reacting cysteinyl residues with, e.g., bromo-trifluoroacetone, alpha-bromo-beta-(5-
imidozoyl) propionic acid; chloroacetyl phosphate, N-alkylmaleimides, 3-nitro-2-pyridyl
disulfide; methyl 2-pyridyl disulfide; p-chloromercuribenzoate; 2-chloromercuri-4
nitrophenol; or, chloro-7-nitrobenzo-oxa-1,3-diazole. Lysine mimetics can be generated (and
amino terminal residues can be altered) by reacting lysinyl with, e.g., succinic or other
carboxylic acid anhydrides. Lysine and other alpha-amino-containing residue mimetics can
also be generated by reaction with imidoesters, such as methyl picolinimidate, pyridoxal
phosphate, pyridoxal, chloroborohydride, trinitro-benzenesulfonic acid, O-methylisourea,
2,4-pentanedione, and transamidase-catalyzed reactions with glyoxylate. Mimetics of methionine
can be generated by reaction with, e.g., methionine sulfoxide. Mimetics of proline include,
e.g., pipecolic acid, thiazolidine carboxylic acid, 3- or 4-hydroxy proline, dehydroproline,
3- or 4-methylproline, or 3,3-dimethylproline. Histidine residue mimetics can be generated by
reacting histidyl with, e.g., diethylprocarbonate or para-bromophenacyl bromide. Other
mimetics include, e.g., those generated by hydroxylation of proline and lysine;
phosphorylation of the hydroxyl groups of seryl or threonyl residues; methylation of the
alpha-amino groups of lysine, arginine and histidine; acetylation of the N-terminal amine;
methylation of main chain amide residues or substitution with N-methyl amino acids; or
amidation of C-terminal carboxyl groups.

A residue, e.g., an amino acid, of a polypeptide of the invention can also be
replaced by an amino acid (or peptidomimetic residue) of the opposite chirality. Thus, any
amino acid naturally occurring in the L-configuration (which can also be referred to as the R
or S, depending upon the structure of the chemical entity) can be replaced with the amino
acid of the same chemical structural type or a peptidomimetic, but of the opposite chirality,
referred to as the D-amino acid, but also can be referred to as the R- or S-form.

The invention also provides methods for modifying the polypeptides of the
invention by either natural processes, such as post-translational processing (e.g.,
phosphorylation, acylation, etc.), or by chemical modification techniques, and the resulting
modified polypeptides. Modifications can occur anywhere in the polypeptide, including the
peptide backbone, the amino acid side-chains and the amino or carboxyl termini. It will be
appreciated that the same type of modification may be present in the same or varying degrees
at several sites in a given polypeptide. Also a given polypeptide may have many types of
modifications. Modifications include acetylation, acylation, ADP-ribosylation, amidation,
covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of
a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative,
covalent attachment of a phosphatidylinositol, cross-linking cyclization, disulfide bond
formation, demethylation, formation of covalent cross-links, formation of cysteine, formation
of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation,
hydroxylation, iodination, methylation, myristolyation, oxidation, pegylation, proteolytic
processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, and
transfer-RNA mediated addition of amino acids to protein such as arginylation. See, e.g., Creighton,
T.E., Proteins - Structure and Molecular Properties 2nd Ed., W.H. Freeman and Company,
New York (1993); Posttranslational Covalent Modification of Proteins, B.C. Johnson, Ed.,
Academic Press, New York, pp. 1-12 (1983).

Solid-phase chemical peptide synthesis methods can also be used to synthesize
the polypeptide or fragments of the invention. Such methods have been known in the art since
the early 1960's (Merrifield, R. B., J. Am. Chem. Soc., 85:2149-2154, 1963) (See also
Stewart, J. M. and Young, J. D., Solid Phase Peptide Synthesis, 2nd Ed., Pierce Chemical
Co., Rockford, Ill., pp. 11-12) and have recently been employed in commercially available
laboratory peptide design and synthesis kits (Cambridge Research Biochemicals). Such
commercially available laboratory kits have generally utilized the teachings of H. M. Geysen
et al., Proc. Natl. Acad. Sci., USA, 81:3998 (1984) and provide for synthesizing peptides upon
the tips of a multitude of "rods" or "pins", all of which are connected to a single plate. When
such a system is utilized, a plate of rods or pins is inverted and inserted into a second plate of
corresponding wells or reservoirs, which contain solutions for attaching or anchoring an
appropriate amino acid to the pin's or rod's tips. By repeating such a process step, i.e.,
inverting and inserting the rod's and pin's tips into appropriate solutions, amino acids are built
into desired peptides. In addition, a number of FMOC peptide synthesis systems
are available. For example, assembly of a polypeptide or fragment can be carried out on a
solid support using an Applied Biosystems, Inc. Model 431A™ automated peptide
synthesizer. Such equipment provides ready access to the peptides of the invention, either by
direct synthesis or by synthesis of a series of fragments that can be coupled using other
known techniques.

Phospholipase enzymes 

The invention provides novel phospholipases, nucleic acids encoding them,
antibodies that bind them, peptides representing the enzyme's antigenic sites (epitopes) and
active sites, and methods for making and using them. In one aspect, polypeptides of the
invention have a phospholipase activity, as described above (e.g., cleavage of a
glycerolphosphate ester linkage). In alternative aspects, the phospholipases of the invention
have activities that have been modified from those of the exemplary phospholipases
described herein. The invention includes phospholipases with and without signal sequences
and the signal sequences themselves. The invention includes immobilized phospholipases,
anti-phospholipase antibodies and fragments thereof. The invention includes
heterocomplexes, e.g., fusion proteins, heterodimers, etc., comprising the phospholipases of
the invention.

Determining peptides representing the enzyme's antigenic sites (epitopes), 
active sites, binding sites, signal sequences, and the like can be done by routine screening
protocols. 

The enzymes of the invention are highly selective catalysts. As with other
enzymes, they catalyze reactions with exquisite stereo-, regio-, and chemo-selectivities that
are unparalleled in conventional synthetic chemistry. Moreover, the enzymes of the
invention are remarkably versatile. They can be tailored to function in organic solvents,
operate at extreme pHs (for example, high pHs and low pHs), extreme temperatures (for
example, high temperatures and low temperatures), extreme salinity levels (for example, high
salinity and low salinity), and catalyze reactions with compounds that are structurally
unrelated to their natural, physiological substrates. Enzymes of the invention can be designed
to be reactive toward a wide range of natural and unnatural substrates, thus enabling the
modification of virtually any organic lead compound. Enzymes of the invention can also be
designed to be highly enantio- and regio-selective. The high degree of functional group
specificity exhibited by these enzymes enables one to keep track of each reaction in a
synthetic sequence leading to a new active compound. Enzymes of the invention can also be
designed to catalyze many diverse reactions unrelated to their native physiological function in
nature.

The present invention exploits the unique catalytic properties of enzymes.
Whereas the use of biocatalysts (i.e., purified or crude enzymes, non-living or living cells) in
chemical transformations normally requires the identification of a particular biocatalyst that
reacts with a specific starting compound, the present invention uses selected biocatalysts,
i.e., the enzymes of the invention, and reaction conditions that are specific for functional
groups that are present in many starting compounds. Each biocatalyst is specific for one
functional group, or several related functional groups, and can react with many starting
compounds containing this functional group. The biocatalytic reactions produce a population
of derivatives from a single starting compound. These derivatives can be subjected to another
round of biocatalytic reactions to produce a second population of derivative compounds.
Thousands of variations of the original compound can be produced with each iteration of
biocatalytic derivatization.

Enzymes react at specific sites of a starting compound without affecting the
rest of the molecule, a process that is very difficult to achieve using traditional chemical
methods. This high degree of biocatalytic specificity provides the means to identify a single
active enzyme within a library. The library is characterized by the series of biocatalytic
reactions used to produce it, a so-called "biosynthetic history". Screening the library for
biological activities and tracing the biosynthetic history identifies the specific reaction
sequence producing the active compound. The reaction sequence is repeated and the
structure of the synthesized compound determined. This mode of identification, unlike other
synthesis and screening approaches, does not require immobilization technologies, and
compounds can be synthesized and tested free in solution using virtually any type of
screening assay. It is important to note that the high degree of specificity of enzyme
reactions on functional groups allows for the "tracking" of specific enzymatic reactions that
make up the biocatalytically produced library.
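
The bookkeeping behind such a "biosynthetic history" can be sketched as follows. This is a minimal illustration only: the reaction names, the naming scheme for derivatives and the function names are hypothetical, and the sketch assumes for simplicity that every reaction can be applied in every round, so that a library built from r reactions over n rounds contains r to the power n entries.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

# Hypothetical biocatalytic reaction types, used only for illustration.
REACTIONS = ("acylation", "glycosylation", "hydrolysis")


def build_library(start: str,
                  reactions: Sequence[str] = REACTIONS,
                  rounds: int = 2) -> Dict[str, Tuple[str, ...]]:
    """Map each derivative label to the ordered reactions that produced it."""
    library = {}
    for history in product(reactions, repeat=rounds):
        label = start + "".join("+" + step for step in history)
        library[label] = history
    return library


def trace_history(library: Dict[str, Tuple[str, ...]],
                  active_compound: str) -> Tuple[str, ...]:
    """Return the reaction sequence recorded for a compound found active."""
    return library[active_compound]


if __name__ == "__main__":
    lib = build_library("lead", rounds=2)
    print(len(lib))
    print(trace_history(lib, "lead+acylation+hydrolysis"))
```

Once a screening assay flags a derivative as active, looking up its recorded history gives the reaction sequence to repeat for structure determination.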

The invention also provides methods of discovering new phospholipases using
the nucleic acids, polypeptides and antibodies of the invention. In one aspect, lambda phage
libraries are screened for expression-based discovery of phospholipases. Use of lambda phage
libraries in screening allows detection of toxic clones; improved access to substrate; reduced
need for engineering a host, by-passing the potential for any bias resulting from mass
excision of the library; and faster growth at low clone densities. Screening of lambda phage
libraries can be in liquid phase or in solid phase. Screening in liquid phase gives greater
flexibility in assay conditions; additional substrate flexibility; higher sensitivity for weak
clones; and ease of automation over solid phase screening.

Many of the procedural steps are performed using robotic automation enabling
the execution of many thousands of biocatalytic reactions and screening assays per day as
well as ensuring a high level of accuracy and reproducibility (see discussion of arrays,
below). As a result, a library of derivative compounds can be produced in a matter of weeks.
For further teachings on modification of molecules, including small molecules, see
PCT/US94/09174.

Phospholipase signal sequences and catalytic domains 

The invention provides phospholipase signal sequences (e.g., signal peptides
(SPs)) and catalytic domains (CDs). The invention provides nucleic acids encoding these
catalytic domains (CDs) and signal sequences (SPs, e.g., a peptide having a sequence
comprising/consisting of amino terminal residues of a polypeptide of the invention). In one
aspect, the invention provides a signal sequence comprising a peptide comprising/consisting
of a sequence as set forth in residues 1 to 20, 1 to 21, 1 to 22, 1 to 23, 1 to 24, 1 to 25, 1 to 26,
1 to 27, 1 to 28, 1 to 29, 1 to 30, 1 to 31, 1 to 32 or 1 to 33 of a polypeptide of the invention,
e.g., SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID
NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22,
SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID
NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44,
SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID
NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66,
SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:74, SEQ ID NO:76, SEQ ID
NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:86, SEQ ID NO:88,
SEQ ID NO:90, SEQ ID NO:92, SEQ ID NO:94, SEQ ID NO:96, SEQ ID NO:98, SEQ ID
NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106.

Exemplary signal sequences are set forth in the SEQ ID listing, e.g., residues 1
to 24 of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6; residues 1 to 29 of SEQ ID NO:8; 
residues 1 to 20 of SEQ ID NO:10; residues 1 to 19 of SEQ ID NO:20; residues 1 to 28 of 
SEQ ID NO:22; residues 1 to 20 of SEQ ID NO:32; residues 1 to 23 of SEQ ID NO:38; see 
SEQ ID listing for other exemplary signal sequences of the invention. 
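
For illustration, the exemplary residue ranges recited above can be applied to a polypeptide sequence as sketched below to separate the signal sequence from the remainder of the polypeptide. The residue ranges in the table are those given in the preceding paragraph; the function name and the placeholder sequence in the usage lines are hypothetical, since the actual sequences appear only in the SEQ ID listing.

```python
from typing import Tuple

# Signal sequence lengths (residues 1 to N) recited in the text above,
# keyed by SEQ ID NO.
SIGNAL_SEQUENCE_LENGTH = {
    2: 24, 4: 24, 6: 24,
    8: 29,
    10: 20,
    20: 19,
    22: 28,
    32: 20,
    38: 23,
}


def split_signal_sequence(seq_id_no: int, sequence: str) -> Tuple[str, str]:
    """Return (signal sequence, remainder) for a polypeptide of the given SEQ ID NO."""
    try:
        n = SIGNAL_SEQUENCE_LENGTH[seq_id_no]
    except KeyError:
        raise KeyError("no exemplary signal sequence recorded for SEQ ID NO:%d"
                       % seq_id_no)
    if len(sequence) < n:
        raise ValueError("sequence is shorter than the signal sequence length")
    # Residues are numbered from 1, so residues 1 to n are sequence[0:n].
    return sequence[:n], sequence[n:]


if __name__ == "__main__":
    # Placeholder sequence, NOT the real SEQ ID NO:2.
    placeholder = "M" + "A" * 23 + "GSTK"
    signal, remainder = split_signal_sequence(2, placeholder)
    print(len(signal), remainder)
```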

The phospholipase signal sequences of the invention can be isolated peptides,
or sequences joined to another phospholipase or a non-phospholipase polypeptide, e.g., as a
fusion protein. In one aspect, the invention provides polypeptides comprising phospholipase
signal sequences of the invention. In one aspect, polypeptides comprising phospholipase
signal sequences of the invention comprise sequences heterologous to a phospholipase of the
invention (e.g., a fusion protein comprising a phospholipase signal sequence of the invention
and sequences from another phospholipase or a non-phospholipase protein). In one aspect,
the invention provides phospholipases of the invention with heterologous signal sequences,
e.g., sequences with a yeast signal sequence. A phospholipase of the invention can comprise
a heterologous signal sequence, e.g., in a vector, e.g., a pPIC series vector (Invitrogen,
Carlsbad, CA).

In one aspect, the signal sequences of the invention are identified following
identification of novel phospholipase polypeptides. The pathways by which proteins are
sorted and transported to their proper cellular location are often referred to as protein
targeting pathways. One of the most important elements in all of these targeting systems is a
short amino acid sequence at the amino terminus of a newly synthesized polypeptide called
the signal sequence. This signal sequence directs a protein to its appropriate location in the
cell and is removed during transport or when the protein reaches its final destination. Most
lysosomal, membrane, or secreted proteins have an amino-terminal signal sequence that
marks them for translocation into the lumen of the endoplasmic reticulum. More than 100
signal sequences for proteins in this group have been determined. The signal sequences can
vary in length from 13 to 36 amino acid residues. Various methods of recognition of signal
sequences are known to those of skill in the art. For example, in one aspect, novel
phospholipase signal peptides are identified by a method referred to as SignalP. SignalP uses
a combined neural network which recognizes both signal peptides and their cleavage sites
(Nielsen, et al., "Identification of prokaryotic and eukaryotic signal peptides and prediction of
their cleavage sites." Protein Engineering, vol. 10, no. 1, p. 1-6 (1997)).

It should be understood that in some aspects phospholipases of the invention
may not have signal sequences. In one aspect, the invention provides the phospholipases of
the invention lacking all or part of a signal sequence. In one aspect, the invention provides a
nucleic acid sequence encoding a signal sequence from one phospholipase operably linked to
a nucleic acid sequence of a different phospholipase or, optionally, a signal sequence from a
non-phospholipase protein may be desired.

The invention also provides isolated or recombinant polypeptides comprising
signal sequences (SPs) and catalytic domains (CDs) of the invention and heterologous
sequences. The heterologous sequences are sequences not naturally associated (e.g., to a
phospholipase) with an SP and/or CD. The sequence to which the SP and/or CD are not
naturally associated can be on the SP's and/or CD's amino terminal end, carboxy terminal
end, and/or on both ends of the SP and/or CD. In one aspect, the invention provides an
isolated or recombinant polypeptide comprising (or consisting of) a polypeptide comprising a
signal sequence (SP) and/or catalytic domain (CD) of the invention with the proviso that it is
not associated with any sequence to which it is naturally associated (e.g., a phospholipase
sequence). Similarly, in one aspect, the invention provides isolated or recombinant nucleic
acids encoding these polypeptides. Thus, in one aspect, the isolated or recombinant nucleic
acid of the invention comprises coding sequence for a signal sequence (SP) and/or catalytic
domain (CD) of the invention and a heterologous sequence (i.e., a sequence not naturally
associated with a signal sequence (SP) and/or catalytic domain (CD) of the invention).
The heterologous sequence can be on the 3' terminal end, 5' terminal end, and/or on both
ends of the SP and/or CD coding sequence.

Assays for phospholipase activity

The invention provides isolated or recombinant polypeptides having a
phospholipase activity and nucleic acids encoding them. Any of the many phospholipase
activity assays known in the art can be used to determine if a polypeptide has a
phospholipase activity and is within the scope of the invention. Routine protocols for
determining phospholipase A, B, D and C, patatin and lipid acyl hydrolase activities are well
known in the art.

Exemplary activity assays include turbidity assays, methylumbelliferyl
phosphocholine (fluorescent) assays, Amplex red (fluorescent) phospholipase assays, thin
layer chromatography assays (TLC), cytolytic assays and p-nitrophenylphosphorylcholine
assays. Using these assays polypeptides can be quickly screened for phospholipase activity.

The phospholipase activity can comprise a lipid acyl hydrolase (LAH)
activity. See, e.g., Jimenez (2001) Lipids 36:1169-1174, describing an octaethylene glycol 
monododecyl ether-based mixed micellar assay for determining the lipid acyl hydrolase 
activity of a patatin. Pinsirodom (2000) J. Agric. Food Chem. 48:155-160, describes an 
exemplary lipid acyl hydrolase (LAH) patatin activity. 

Turbidity assays to determine phospholipase activity are described, e.g., in
Kauffmann (2001) "Conversion of Bacillus thermocatenulatus lipase into an efficient
phospholipase with increased activity towards long-chain fatty acyl substrates by directed
evolution and rational design," Protein Engineering 14:919-928; Ibrahim (1995) "Evidence
implicating phospholipase as a virulence factor of Candida albicans," Infect. Immun.
63:1993-1998.

Methylumbelliferyl (fluorescent) phosphocholine assays to determine
phospholipase activity are described, e.g., in Goode (1997) "Evidence for cell surface and
internal phospholipase activity in ascidian eggs," Develop. Growth Differ. 39:655-660; Diaz
(1999) "Direct fluorescence-based lipase activity assay," BioTechniques 27:696-700.

Amplex Red (fluorescent) Phospholipase Assays to determine phospholipase
activity are available as kits, e.g., the detection of phosphatidylcholine-specific phospholipase
using an Amplex Red phosphatidylcholine-specific phospholipase assay kit from Molecular
Probes Inc. (Eugene, OR), according to manufacturer's instructions. Fluorescence is
measured in a fluorescence microplate reader using excitation at 560 ± 10 nm and
fluorescence detection at 590 ± 10 nm. The assay is sensitive at very low enzyme
concentrations.

Thin layer chromatography assays (TLC) to determine phospholipase activity 
are described, e.g., in Reynolds (1991) Methods in Enzymol. 197:3-13; Taguchi (1975) 
"Phospholipase from Clostridium novyi type A.I," Biochim. Biophys. Acta 409:75-85. Thin 
layer chromatography (TLC) is a widely used technique for detection of phospholipase 
activity. Various modifications of this method have been used to extract the phospholipids 
from the aqueous assay mixtures. In some PLC assays the hydrolysis is stopped by addition 
of chloroform/methanol (2:1) to the reaction mixture. The unreacted starting material and the 
diacylglycerol are extracted into the organic phase and may be fractionated by TLC, while 
the head group product remains in the aqueous phase. For more precise measurement of the 
phospholipid digestion, radiolabeled substrates can be used (see, e.g., Reynolds (1991) 
Methods in Enzymol. 197:3-13). The ratios of products and reactants can be used to 
calculate the actual number of moles of substrate hydrolyzed per unit time. If all the 
components are extracted equally, any losses in the extraction will affect all components 
equally. Separation of phospholipid digestion products can be achieved by silica gel TLC 
with chloroform/methanol/water (65:25:4) used as a solvent system (see, e.g., Taguchi (1975) 
Biochim. Biophys. Acta 409:75-85). 
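
The ratio-based calculation described above for radiolabeled substrates can be illustrated with a minimal sketch. The function name, the units (nmol, minutes) and the example counts are illustrative assumptions and are not taken from the cited protocols; the sketch only shows why a uniform extraction loss cancels out of the product/reactant ratio.

```python
def hydrolysis_rate(product_counts, substrate_counts, initial_substrate_nmol, minutes):
    """Return nmol of substrate hydrolyzed per minute from TLC spot counts.

    product_counts and substrate_counts are radioactivity counts recovered from
    the product and unreacted-substrate spots (hypothetical inputs).
    """
    total = product_counts + substrate_counts
    if total <= 0 or minutes <= 0:
        raise ValueError("counts and time must be positive")
    # Only the ratio of product to total label is used, so a loss that affects
    # both spots equally does not change the result.
    fraction_hydrolyzed = product_counts / total
    return fraction_hydrolyzed * initial_substrate_nmol / minutes


# 25% of the label is in the product spot after 10 minutes, starting from 100 nmol.
full_recovery = hydrolysis_rate(2500.0, 7500.0, 100.0, 10.0)
# The same sample with half of every component lost during extraction.
half_recovery = hydrolysis_rate(1250.0, 3750.0, 100.0, 10.0)
print(full_recovery, half_recovery)
```

Both calls give the same rate, which is the point made in the text about components being extracted equally.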

p-Nitrophenylphosphorylcholine assays to determine phospholipase activity 
are described, e.g., in Korbsrisate (1999) "Cloning and characterization of a nonhemolytic 
phospholipase gene from Burkholderia pseudomallei," J. Clin. Microbiol. 37:3742-3745; 
Berka (1981) "Studies of phospholipase (heat labile hemolysin) in Pseudomonas 
aeruginosa," Infect. Immun. 34:1071-1074. This assay is based on enzymatic hydrolysis of 
the substrate analog p-nitrophenylphosphorylcholine to liberate a yellow chromogenic 
compound p-nitrophenol, detectable at 405 nm. This substrate is convenient for high-throughput 
screening. 
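
As a worked illustration of how an absorbance reading at 405 nm could be converted into an amount of p-nitrophenol released, the following sketch applies the Beer-Lambert law. The extinction coefficient default (18,000 M^-1 cm^-1), the path length and the reaction volume are assumptions chosen for the example; the coefficient of p-nitrophenol depends on pH and should be calibrated for the actual assay buffer.

```python
def pnp_release_rate(delta_a405_per_min, epsilon_m_cm=18000.0, path_cm=1.0, volume_ml=1.0):
    """Return micromoles of p-nitrophenol released per minute.

    epsilon_m_cm is an assumed molar extinction coefficient (M^-1 cm^-1);
    path_cm and volume_ml describe a hypothetical cuvette or well.
    """
    if epsilon_m_cm <= 0 or path_cm <= 0 or volume_ml <= 0:
        raise ValueError("epsilon, path length and volume must be positive")
    # Beer-Lambert: A = epsilon * c * l, so c = A / (epsilon * l), in mol/L per minute.
    molar_per_min = delta_a405_per_min / (epsilon_m_cm * path_cm)
    # Convert mol/L/min to mol/min using the volume in liters, then to micromoles.
    return molar_per_min * (volume_ml / 1000.0) * 1e6


# An absorbance increase of 0.18 per minute in a 1 ml, 1 cm path reaction.
rate = pnp_release_rate(0.18)
print(f"{rate:.4f} umol/min")
```

With the assumed parameters the example corresponds to roughly 0.01 micromole of p-nitrophenol per minute.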

A cytolytic assay can detect phospholipases with cytolytic activity based on 
lysis of erythrocytes. Toxic phospholipases can interact with eukaryotic cell membranes and 
hydrolyze phosphatidylcholine and sphingomyelin, leading to cell lysis. See, e.g., Titball 
(1993) Microbiol. Rev. 57:347-366. 

Hybrid (chimeric) phospholipases and peptide libraries 

In one aspect, the invention provides hybrid phospholipases and fusion 
proteins, including peptide libraries, comprising sequences of the invention. The peptide 
libraries of the invention can be used to isolate peptide modulators (e.g., activators or 
inhibitors) of targets, such as phospholipase substrates, receptors, enzymes. The peptide 
libraries of the invention can be used to identify formal binding partners of targets, such as 
ligands, e.g., cytokines, hormones and the like. In one aspect, the invention provides 
chimeric proteins comprising a signal sequence (SP) and/or catalytic domain (CD) of the 
invention and a heterologous sequence (see above). 

The invention also provides methods for generating "improved" and hybrid 
phospholipases using the nucleic acids and polypeptides of the invention. For example, the 
invention provides methods for generating enzymes that have activity, e.g., phospholipase 
activity (such as, e.g., phospholipase A, B, C or D activity, patatin esterase activity, cleavage 
of a glycerolphosphate ester linkage, cleavage of an ester linkage in a phospholipid in a 
vegetable oil) at extreme alkaline pHs and/or acidic pHs, high and low temperatures, osmotic 
conditions and the like. The invention provides methods for generating hybrid enzymes (e.g., 
hybrid phospholipases). 

In one aspect, the methods of the invention produce new hybrid polypeptides 
by utilizing cellular processes that integrate the sequence of a first polynucleotide such that 
resulting hybrid polynucleotides encode polypeptides demonstrating activities derived from 
the first biologically active polypeptides. For example, the first polynucleotides can be an 
exemplary nucleic acid sequence (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID 
NO:7, etc.) encoding an exemplary phospholipase of the invention (e.g., SEQ ID NO:2, SEQ 
ID NO:4, SEQ ID NO:6, SEQ ID NO:8, etc.). The first nucleic acid can encode an enzyme 
from one organism that functions effectively under a particular environmental condition, e.g., 
high salinity. It can be "integrated" with an enzyme encoded by a second polynucleotide 
from a different organism that functions effectively under a different environmental 
condition, such as extremely high temperatures. For example, the two nucleic acids 
can produce a hybrid molecule by, e.g., recombination and/or reductive reassortment. A 
hybrid polynucleotide containing sequences from the first and second original 
polynucleotides may encode an enzyme that exhibits characteristics of both enzymes encoded 
by the original polynucleotides. Thus, the enzyme encoded by the hybrid polynucleotide may 
function effectively under environmental conditions shared by each of the enzymes encoded 
by the first and second polynucleotides, e.g., high salinity and extreme temperatures. 
Alternatively, a hybrid polypeptide resulting from this method of the invention 
may exhibit specialized enzyme activity not displayed in the original enzymes. For example, 
following recombination and/or reductive reassortment of polynucleotides encoding 
phospholipase activities, the resulting hybrid polypeptide encoded by a hybrid polynucleotide 
can be screened for specialized activities obtained from each of the original enzymes, i.e., the 
type of bond on which the phospholipase acts and the temperature at which the phospholipase 
functions. Thus, for example, the phospholipase may be screened to ascertain those chemical 
functionalities which distinguish the hybrid phospholipase from the original phospholipases, 
such as: (a) amide (peptide bonds), i.e., phospholipases; (b) ester bonds, i.e., amylases and 
lipases; (c) acetals, i.e., glycosidases and, for example, the temperature, pH or salt 
concentration at which the hybrid polypeptide functions. 

Sources of the polynucleotides to be "integrated" with nucleic acids of the 
invention may be isolated from individual organisms ("isolates"), collections of organisms 
that have been grown in defined media ("enrichment cultures"), or uncultivated organisms 
("environmental samples"). The use of a culture-independent approach to derive 
polynucleotides encoding novel bioactivities from environmental samples is most preferable 
since it allows one to access untapped resources of biodiversity. "Environmental libraries" 
are generated from environmental samples and represent the collective genomes of naturally 
occurring organisms archived in cloning vectors that can be propagated in suitable 
prokaryotic hosts. Because the cloned DNA is initially extracted directly from environmental 
samples, the libraries are not limited to the small fraction of prokaryotes that can be grown in 
pure culture. Additionally, a normalization of the environmental DNA present in these 
samples could allow more equal representation of the DNA from all of the species present in 
the original sample. This can dramatically increase the efficiency of finding interesting genes 
from minor constituents of the sample that may be under-represented by several orders of 
magnitude compared to the dominant species. 
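
The effect of normalization on screening effort can be illustrated with a short calculation. The sketch below assumes that clones are drawn independently and that a species contributes a fixed fraction of the library; under that assumption the number of clones N needed to recover at least one clone from the species with probability P is N = ln(1 - P) / ln(1 - f). The function name and the example fractions are hypothetical and serve only to show the scale of the difference.

```python
import math


def clones_needed(fraction, probability=0.95):
    """Clones to screen so that at least one derives from a species that makes up
    `fraction` of the library, with the given probability (independent draws assumed)."""
    if not 0.0 < fraction < 1.0 or not 0.0 < probability < 1.0:
        raise ValueError("fraction and probability must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# A minor constituent at 0.1% of the library versus the same species
# after a hypothetical normalization that raises it to 10%.
before = clones_needed(0.001)
after = clones_needed(0.1)
print(before, after)
```

Under these assumptions the screening effort drops by about two orders of magnitude, in line with the statement that normalization can increase the efficiency of finding genes from minor constituents.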

For example, gene libraries generated from one or more uncultivated 
microorganisms are screened for an activity of interest. Potential pathways encoding 
bioactive molecules of interest are first captured in prokaryotic cells in the form of gene 
expression libraries. Polynucleotides encoding activities of interest are isolated from such 
libraries and introduced into a host cell. The host cell is grown under conditions that promote 
recombination and/or reductive reassortment creating potentially active biomolecules with 
novel or enhanced activities. 

The microorganisms from which hybrid polynucleotides may be prepared 
include prokaryotic microorganisms, such as Eubacteria and Archaebacteria, and lower 
eukaryotic microorganisms such as fungi, some algae and protozoa. Polynucleotides may be 
isolated from environmental samples. Nucleic acid may be recovered without culturing of an 
organism or recovered from one or more cultured organisms. In one aspect, such 
microorganisms may be extremophiles, such as hyperthermophiles, psychrophiles, 
psychrotrophs, halophiles, barophiles and acidophiles. In one aspect, polynucleotides 
encoding phospholipase enzymes isolated from extremophilic microorganisms are used to 
make hybrid enzymes. Such enzymes may function at temperatures above 100°C in, e.g., 
terrestrial hot springs and deep sea thermal vents, at temperatures below 0°C in, e.g., arctic 
waters, in the saturated salt environment of, e.g., the Dead Sea, at pH values around 0 in, e.g., 
coal deposits and geothermal sulfur-rich springs, or at pH values greater than 11 in, e.g., 
sewage sludge. For example, phospholipases cloned and expressed from extremophilic 
organisms can show high activity throughout a wide range of temperatures and pHs. 

Polynucleotides selected and isolated as described herein, including at least 
one nucleic acid of the invention, are introduced into a suitable host cell. A suitable host cell 
is any cell that is capable of promoting recombination and/or reductive reassortment. The 
selected polynucleotides can be in a vector that includes appropriate control sequences. The 
host cell can be a higher eukaryotic cell, such as a mammalian cell, or a lower eukaryotic cell, 
such as a yeast cell, or preferably, the host cell can be a prokaryotic cell, such as a bacterial 
cell. Introduction of the construct into the host cell can be effected by calcium phosphate 
transfection, DEAE-Dextran mediated transfection, or electroporation (Davis et al., 1986). 

As representative examples of appropriate hosts, there may be mentioned: 
bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium; fungal cells, such as 
yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, 
COS or Bowes melanoma; adenoviruses; and plant cells. The selection of an appropriate host 
for recombination and/or reductive reassortment or just for expression of recombinant protein 
is deemed to be within the scope of those skilled in the art from the teachings herein. 
Mammalian cell culture systems that can be employed for recombination and/or reductive 
reassortment or just for expression of recombinant protein include, e.g., the COS-7 lines of 
monkey kidney fibroblasts, described in "SV40-transformed simian cells support the 
replication of early SV40 mutants" (Gluzman, 1981), the C127, 3T3, CHO, HeLa and BHK 
cell lines. Mammalian expression vectors can comprise an origin of replication, a suitable 
promoter and enhancer, and necessary ribosome binding sites, polyadenylation site, splice 
donor and acceptor sites, transcriptional termination sequences, and 5' flanking non-transcribed 
sequences. DNA sequences derived from the SV40 splice and polyadenylation 
sites may be used to provide the required non-transcribed genetic elements. 

Host cells containing the polynucleotides of interest (for recombination and/or 
reductive reassortment or just for expression of recombinant protein) can be cultured in 
conventional nutrient media modified as appropriate for activating promoters, selecting 
transformants or amplifying genes. The culture conditions, such as temperature, pH and the 
like, are those previously used with the host cell selected for expression, and will be apparent 
to the ordinarily skilled artisan. The clones which are identified as having the specified 
enzyme activity may then be sequenced to identify the polynucleotide sequence encoding an 
enzyme having the enhanced activity. 

In another aspect, the nucleic acids and methods of the present invention can 
be used to generate novel polynucleotides for biochemical pathways, e.g., pathways from one 
or more operons or gene clusters or portions thereof. For example, bacteria and many 
eukaryotes have a coordinated mechanism for regulating genes whose products are involved 
in related processes. The genes are clustered, in structures referred to as "gene clusters," on a 
single chromosome and are transcribed together under the control of a single regulatory 
sequence, including a single promoter which initiates transcription of the entire cluster. Thus, 
a gene cluster is a group of adjacent genes that are either identical or related, usually as to 
their function. 

Gene cluster DNA can be isolated from different organisms and ligated into 
vectors, particularly vectors containing expression regulatory sequences which can control 
and regulate the production of a detectable protein or protein-related array activity from the 
ligated gene clusters. Use of vectors which have an exceptionally large capacity for 
exogenous DNA introduction is particularly appropriate for use with such gene clusters; such 
vectors are described by way of example herein to include the f-factor (or fertility factor) of E. coli. 
This f-factor of E. coli is a plasmid which effects high-frequency transfer of itself during 
conjugation and is ideal to achieve and stably propagate large DNA fragments, such as gene 
clusters from mixed microbial samples. "Fosmids," cosmids or bacterial artificial 
chromosome (BAC) vectors can be used as cloning vectors. These are derived from E. coli 
f-factor which is able to stably integrate large segments of genomic DNA. When integrated 
with DNA from a mixed uncultured environmental sample, this makes it possible to achieve 
large genomic fragments in the form of a stable "environmental DNA library." Cosmid 
vectors were originally designed to clone and propagate large segments of genomic DNA. 
Cloning into cosmid vectors is described in detail in Sambrook et al., Molecular Cloning: A 
Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press (1989). Once ligated into 
an appropriate vector, two or more vectors containing different polyketide synthase gene 
clusters can be introduced into a suitable host cell. Regions of partial sequence homology 
shared by the gene clusters will promote processes which result in sequence reorganization 
resulting in a hybrid gene cluster. The novel hybrid gene cluster can then be screened for 
enhanced activities not found in the original gene clusters. 

Thus, in one aspect, the invention relates to a method for producing a 
biologically active hybrid polypeptide using a nucleic acid of the invention and screening the 
polypeptide for an activity (e.g., enhanced activity) by: 

(1) introducing at least a first polynucleotide (e.g., a nucleic acid of the 
invention) in operable linkage and a second polynucleotide in operable linkage, said at least 
first polynucleotide and second polynucleotide sharing at least one region of partial sequence 
homology, into a suitable host cell; 

(2) growing the host cell under conditions which promote sequence 
reorganization resulting in a hybrid polynucleotide in operable linkage; 

(3) expressing a hybrid polypeptide encoded by the hybrid polynucleotide; 

(4) screening the hybrid polypeptide under conditions which promote 
identification of the desired biological activity (e.g., enhanced phospholipase activity); and 

(5) isolating a polynucleotide encoding the hybrid polypeptide. 

Methods for screening for various enzyme activities are known to those of 
skill in the art and are discussed throughout the present specification. Such methods may be 
employed when isolating the polypeptides and polynucleotides of the invention. 
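
The sequence reorganization in step (2) depends on a region of partial sequence homology shared by the two polynucleotides. The following toy sketch shows, purely in silico, how a hybrid sequence can be described as the 5' portion of a first sequence joined to the 3' portion of a second sequence at their longest shared region. The function name, the minimum length and the example sequences are hypothetical; the sketch is not a model of the cellular recombination machinery and is not a method described in this specification.

```python
def crossover_at_shared_region(first, second, min_len=6):
    """Join the 5' part of `first` to the 3' part of `second` at their longest
    exactly shared substring of at least `min_len` bases; return None if none exists."""
    best = ""
    for i in range(len(first)):
        # Only consider candidates longer than the best found so far.
        for j in range(i + max(min_len, len(best) + 1), len(first) + 1):
            candidate = first[i:j]
            if candidate in second:
                best = candidate
            else:
                break  # a longer substring starting at i cannot match either
    if not best:
        return None
    cut_first = first.index(best) + len(best)
    cut_second = second.index(best) + len(best)
    return first[:cut_first] + second[cut_second:]


# Two made-up sequences sharing the region CCCGGG.
parent_a = "ATGAAACCCGGGTTTAAA"
parent_b = "ATGCTTCCCGGGACGTGA"
hybrid = crossover_at_shared_region(parent_a, parent_b)
print(hybrid)
```

The hybrid carries the upstream portion of the first sequence and the downstream portion of the second, which is the kind of product that steps (3) to (5) would then express, screen and isolate.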

In vivo reassortment can be focused on "inter-molecular" processes 
collectively referred to as "recombination." In bacteria it is generally viewed as a 
"RecA-dependent" phenomenon. The invention can rely on recombination processes of a host cell to 
recombine and re-assort sequences, or the cells' ability to mediate reductive processes to 
decrease the complexity of quasi-repeated sequences in the cell by deletion. This process of 
"reductive reassortment" occurs by an "intra-molecular", RecA-independent process. Thus, 
in one aspect of the invention, using the nucleic acids of the invention novel polynucleotides 
are generated by the process of reductive reassortment. The method involves the generation 
of constructs containing consecutive sequences (original encoding sequences), their insertion 
into an appropriate vector, and their subsequent introduction into an appropriate host cell. 
The reassortment of the individual molecular identities occurs by combinatorial processes 
between the consecutive sequences in the construct possessing regions of homology, or 
between quasi-repeated units. The reassortment process recombines and/or reduces the 
complexity and extent of the repeated sequences, and results in the production of novel 
molecular species. 

Various treatments may be applied to enhance the rate of reassortment. These 
could include treatment with ultra-violet light, or DNA damaging chemicals, and/or the use 
of host cell lines displaying enhanced levels of "genetic instability". Thus the reassortment 
process may involve homologous recombination or the natural property of quasi-repeated 
sequences to direct their own evolution. 

Repeated or "quasi-repeated" sequences play a role in genetic instability. 
"Quasi-repeats" are repeats that are not restricted to their original unit structure. 
Quasi-repeated units can be presented as an array of sequences in a construct; consecutive units of 
similar sequences. Once ligated, the junctions between the consecutive sequences become 
essentially invisible and the quasi-repetitive nature of the resulting construct is now 
continuous at the molecular level. The deletion process the cell performs to reduce the 
complexity of the resulting construct operates between the quasi-repeated sequences. The 
quasi-repeated units provide a practically limitless repertoire of templates upon which 
slippage events can occur. The constructs containing the quasi-repeats thus effectively 
provide sufficient molecular elasticity that deletion (and potentially insertion) events can 
occur virtually anywhere within the quasi-repetitive units. When the quasi-repeated 
sequences are all ligated in the same orientation, for instance head to tail or vice versa, the 
cell cannot distinguish individual units. Consequently, the reductive process can occur 
throughout the sequences. In contrast, when for example, the units are presented head to 
head, rather than head to tail, the inversion delineates the endpoints of the adjacent unit so 
that deletion formation will favor the loss of discrete units. Thus, in one aspect of the 
invention, the sequences to be reassorted are in the same orientation. Random orientation of 
quasi-repeated sequences will result in the loss of reassortment efficiency, while consistent 
orientation of the sequences will offer the highest efficiency. However, while having fewer 
of the contiguous sequences in the same orientation decreases the efficiency, it may still 
provide sufficient elasticity for the effective recovery of novel molecules. Constructs can be 
made with the quasi-repeated sequences in the same orientation to allow higher efficiency. 

Sequences can be assembled in a head to tail orientation using any of a variety 
of methods, including the following: a) Primers that include a poly-A head and poly-T tail 
which when made single-stranded would provide orientation can be utilized. This is 
accomplished by having the first few bases of the primers made from RNA and hence easily 
removed by RNase H. b) Primers that include unique restriction cleavage sites can be utilized. 
Multiple sites, a battery of unique sequences, and repeated synthesis and ligation steps would 
be required. c) The inner few bases of the primer could be thiolated and an exonuclease used 
to produce properly tailed molecules. 

The recovery of the re-assorted sequences relies on the identification of 
cloning vectors with a reduced repetitive index (RI). The re-assorted encoding sequences can 
then be recovered by amplification. The products are re-cloned and expressed. The recovery 
of cloning vectors with reduced RI can be effected by: 1) The use of vectors only stably 
maintained when the construct is reduced in complexity. 2) The physical recovery of 
shortened vectors by physical procedures. In this case, the cloning vector would be recovered 
using standard plasmid isolation procedures and size fractionated on either an agarose gel, or 
column with a low molecular weight cut off utilizing standard procedures. 3) The recovery 
of vectors containing interrupted genes which can be selected when insert size decreases. 4) 
The use of direct selection techniques with an expression vector and the appropriate selection. 

Encoding sequences (for example, genes) from related organisms may 
demonstrate a high degree of homology and encode quite diverse protein products. These 
types of sequences are particularly useful in the present invention as quasi-repeats. However, 
this process is not limited to such nearly identical repeats. 

The following is an exemplary method of the invention. Encoding nucleic 
acid sequences (quasi-repeats) are derived from three (3) species, including a nucleic acid of 
the invention. Each sequence encodes a protein with a distinct set of properties, including an 
enzyme of the invention. Each of the sequences differs by a single or a few base pairs at a 
unique position in the sequence. The quasi-repeated sequences are separately or collectively 
amplified and ligated into random assemblies such that all possible permutations and 
combinations are available in the population of ligated molecules. The number of 
quasi-repeat units can be controlled by the assembly conditions. The average number of 
quasi-repeated units in a construct is defined as the repetitive index (RI). Once formed, the 
constructs may, or may not be size fractionated on an agarose gel according to published 
protocols, inserted into a cloning vector, and transfected into an appropriate host cell. The 
cells are then propagated and "reductive reassortment" is effected. The rate of the reductive 
reassortment process may be stimulated by the introduction of DNA damage if desired. 
Whether the reduction in RI is mediated by deletion formation between repeated sequences 
by an "intra-molecular" mechanism, or mediated by recombination-like events through 
"inter-molecular" mechanisms is immaterial. The end result is a reassortment of the 
molecules into all possible combinations. In one aspect, the method comprises the additional 
step of screening the library members of the shuffled pool to identify individual shuffled 
library members having the ability to bind or otherwise interact, or catalyze a particular 
reaction (e.g., such as catalytic domain of an enzyme) with a predetermined macromolecule, 
such as for example a proteinaceous receptor, an oligosaccharide, virion, or other 
predetermined compound or structure. The polypeptides, e.g., phospholipases, that are 
identified from such libraries can be used for various purposes, e.g., the industrial processes 
described herein and/or can be subjected to one or more additional cycles of shuffling and/or 
selection. 
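
The repetitive index defined above, and the number of ordered assemblies available from three quasi-repeat units, can be illustrated with a short sketch. The unit labels A, B and C, the example constructs and the function names are hypothetical; the code only restates the arithmetic of the definition (average units per construct) and enumerates ordered arrangements for a fixed construct length.

```python
from itertools import product


def repetitive_index(constructs):
    """Average number of quasi-repeat units per construct (the RI as defined in the text)."""
    if not constructs:
        raise ValueError("at least one construct is required")
    return sum(len(c) for c in constructs) / len(constructs)


def ordered_assemblies(units, n_units):
    """All ordered arrangements of n_units drawn (with repetition) from the given units."""
    return list(product(units, repeat=n_units))


units = ("A", "B", "C")  # quasi-repeats from three species (hypothetical labels)
library = [("A", "B", "C", "A"), ("B", "B"), ("C", "A", "B")]
reduced = [("A", "C"), ("B",), ("C", "B")]  # the same constructs after unit loss

print(repetitive_index(library))          # 3.0
print(len(ordered_assemblies(units, 3)))  # 27
print(repetitive_index(reduced) < repetitive_index(library))
```

A lower RI in the recovered constructs than in the starting library is the signature of reductive reassortment that the recovery strategies described above rely on.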

In another aspect, it is envisioned that prior to or during recombination or 
reassortment, polynucleotides generated by the method of the invention can be subjected to 
agents or processes which promote the introduction of mutations into the original 
polynucleotides. The introduction of such mutations would increase the diversity of resulting 
hybrid polynucleotides and polypeptides encoded therefrom. The agents or processes which 
promote mutagenesis can include, but are not limited to: (+)-CC-1065, or a synthetic analog 
such as (+)-CC-1065-(N3-Adenine) (see Sun and Hurley, (1992)); an N-acetylated or 
deacetylated 4'-fluoro-4-aminobiphenyl adduct capable of inhibiting DNA synthesis (see, for 
example, van de Poll et al. (1992)); or an N-acetylated or deacetylated 4-aminobiphenyl adduct 
capable of inhibiting DNA synthesis (see also, van de Poll et al. (1992), pp. 751-758); 
trivalent chromium, a trivalent chromium salt, a polycyclic aromatic hydrocarbon (PAH) 
DNA adduct capable of inhibiting DNA replication, such as 
7-bromomethyl-benz[a]anthracene ("BMA"), tris(2,3-dibromopropyl)phosphate ("Tris-BP"), 
1,2-dibromo-3-chloropropane ("DBCP"), 2-bromoacrolein (2BA), 
benzo[a]pyrene-7,8-dihydrodiol-9-10-epoxide ("BPDE"), a platinum(II) halogen salt, 
N-hydroxy-2-amino-3-methylimidazo[4,5-f]-quinoline ("N-hydroxy-IQ"), and 
N-hydroxy-2-amino-1-methyl-6-phenylimidazo[4,5-f]-pyridine ("N-hydroxy-PhIP"). 
Especially preferred means for slowing or halting PCR 
amplification consist of UV light, (+)-CC-1065 and (+)-CC-1065-(N3-Adenine). Particularly 
encompassed means are DNA adducts or polynucleotides comprising the DNA adducts from 
the polynucleotides or polynucleotides pool, which can be released or removed by a process 
including heating the solution comprising the polynucleotides prior to further processing. 

Screening Methodologies and "On-line" Monitoring Devices 

In practicing the methods of the invention, a variety of apparatus and 
methodologies can be used in conjunction with the polypeptides and nucleic acids of the 
invention, e.g., to screen polypeptides for phospholipase activity, to screen compounds as 
potential modulators of activity (e.g., potentiation or inhibition of enzyme activity), for 
antibodies that bind to a polypeptide of the invention, for nucleic acids that hybridize to a 
nucleic acid of the invention, and the like. 

Immobilized Enzyme Solid Supports 

The phospholipase enzymes, fragments thereof and nucleic acids that encode 
the enzymes and fragments can be affixed to a solid support. This is often economical and 
efficient in the use of the phospholipases in industrial processes. For example, a consortium 
or cocktail of phospholipase enzymes (or active fragments thereof), which are used in a 
specific chemical reaction, can be attached to a solid support and dunked into a process vat. 
The enzymatic reaction can occur. Then, the solid support can be taken out of the vat, along 
with the enzymes affixed thereto, for repeated use. In one embodiment of the invention, an 
isolated nucleic acid of the invention is affixed to a solid support. In another embodiment of 
the invention, the solid support is selected from the group of a gel, a resin, a polymer, a 
ceramic, a glass, a microelectrode and any combination thereof. 

For example, solid supports useful in this invention include gels. Some 
examples of gels include Sepharose, gelatin, glutaraldehyde, chitosan-treated glutaraldehyde, 
albumin-glutaraldehyde, chitosan-Xanthan, toyopearl gel (polymer gel), alginate, 
alginate-polylysine, carrageenan, agarose, glyoxyl agarose, magnetic agarose, dextran-agarose, 
poly(Carbamoyl Sulfonate) hydrogel, BSA-PEG hydrogel, phosphorylated polyvinyl alcohol 
(PVA), monoaminoethyl-N-aminoethyl (MANA), amino, or any combination thereof. 

Another type of solid support useful in the present invention is a resin or polymer. 
Some examples of resins or polymers include cellulose, acrylamide, nylon, rayon, polyester, 
anion-exchange resin, AMBERLITE™ XAD-7, AMBERLITE™ XAD-8, AMBERLITE™ 
IRA-94, AMBERLITE™ IRC-50, polyvinyl, polyacrylic, polymethacrylate, or any 
combination thereof. 

Another type of solid support useful in the present invention is ceramic. Some 
examples include non-porous ceramic, porous ceramic, SiO2, Al2O3. Another type of solid 
support useful in the present invention is glass. Some examples include non-porous glass, 
porous glass, aminopropyl glass or any combination thereof. Another type of solid support 
that can be used is a microelectrode. An example is a polyethyleneimine-coated magnetite. 
Graphitic particles can be used as a solid support. 

Another example of a solid support is a cell, such as a red blood cell. 

Methods of immobilization 

There are many methods that would be known to one of skill in the art for 
immobilizing enzymes or fragments thereof, or nucleic acids, onto a solid support. Some 
examples of such methods include, e.g., electrostatic droplet generation, electrochemical 
means, via adsorption, via covalent binding, via cross-linking, via a chemical reaction or 
process, via encapsulation, via entrapment, via calcium alginate, or via poly (2-hydroxyethyl 
methacrylate). Like methods are described in Methods in Enzymology, Immobilized 
Enzymes and Cells, Part C. 1987. Academic Press. Edited by S. P. Colowick and N. O. 
Kaplan. Volume 136; and Immobilization of Enzymes and Cells. 1997. Humana Press. 
Edited by G. F. Bickerstaff. Series: Methods in Biotechnology, Edited by J. M. Walker. 

Capillary Arrays 




Capillary arrays, such as the GIGAMATRIX™, Diversa Corporation, San 
Diego, CA, can be used in the methods of the invention. Nucleic acids or polypeptides of 
the invention can be immobilized to or applied to an array, including capillary arrays. Arrays 
can be used to screen for or monitor libraries of compositions (e.g., small molecules, 
antibodies, nucleic acids, etc.) for their ability to bind to or modulate the activity of a nucleic 
acid or a polypeptide of the invention. Capillary arrays provide another system for holding 
and screening samples. For example, a sample screening apparatus can include a plurality of 
capillaries formed into an array of adjacent capillaries, wherein each capillary comprises at 
least one wall defining a lumen for retaining a sample. The apparatus can further include 
interstitial material disposed between adjacent capillaries in the array, and one or more 
reference indicia formed within the interstitial material. A capillary for screening a 
sample, wherein the capillary is adapted for being bound in an array of capillaries, can 
include a first wall defining a lumen for retaining the sample, and a second wall formed of a 
filtering material, for filtering excitation energy provided to the lumen to excite the sample. 

A polypeptide or nucleic acid, e.g., a ligand, can be introduced into a first 
component into at least a portion of a capillary of a capillary array. Each capillary of the 
capillary array can comprise at least one wall defining a lumen for retaining the first 
component. An air bubble can be introduced into the capillary behind the first component. A 
second component can be introduced into the capillary, wherein the second component is 
separated from the first component by the air bubble. A sample of interest can be introduced 
as a first liquid labeled with a detectable particle into a capillary of a capillary array, wherein 
each capillary of the capillary array comprises at least one wall defining a lumen for retaining 
the first liquid and the detectable particle, and wherein the at least one wall is coated with a 
binding material for binding the detectable particle to the at least one wall. The method can 
further include removing the first liquid from the capillary tube, wherein the bound detectable 
particle is maintained within the capillary, and introducing a second liquid into the capillary 
tube. 

The capillary array can include a plurality of individual capillaries comprising
at least one outer wall defining a lumen. The outer wall of the capillary can be one or more
walls fused together. Similarly, the wall can define a lumen that is cylindrical, square,
hexagonal or any other geometric shape so long as the walls form a lumen for retention of a
liquid or sample. The capillaries of the capillary array can be held together in close
proximity to form a planar structure. The capillaries can be bound together, by being fused
(e.g., where the capillaries are made of glass), glued, bonded, or clamped side-by-side. The
capillary array can be formed of any number of individual capillaries, for example, a range
from 100 to 4,000,000 capillaries. A capillary array can form a microtiter plate having about
100,000 or more individual capillaries bound together.

Arrays, or "BioChips" 

Nucleic acids or polypeptides of the invention can be immobilized to or 
applied to an array. Arrays can be used to screen for or monitor libraries of compositions
(e.g., small molecules, antibodies, nucleic acids, etc.) for their ability to bind to or modulate
the activity of a nucleic acid or a polypeptide of the invention. For example, in one aspect of
the invention, a monitored parameter is transcript expression of a phospholipase gene. One
or more, or all, of the transcripts of a cell can be measured by hybridization of a sample
comprising transcripts of the cell, or nucleic acids representative of or complementary to
transcripts of a cell, to immobilized nucleic acids on an array, or "biochip."
By using an "array" of nucleic acids on a microchip, some or all of the transcripts of a cell
can be simultaneously quantified. Alternatively, arrays comprising genomic nucleic acid can
also be used to determine the genotype of a newly engineered strain made by the methods of
the invention. "Polypeptide arrays" can also be used to simultaneously quantify a plurality of
proteins.

The present invention can be practiced with any known "array," also referred 
to as a "microarray" or "nucleic acid array" or "polypeptide array" or "antibody array" or
"biochip," or variation thereof. Arrays are generically a plurality of "spots" or "target
elements," each target element comprising a defined amount of one or more biological
molecules, e.g., oligonucleotides, immobilized onto a defined area of a substrate surface for
specific binding to a sample molecule, e.g., mRNA transcripts. 

In practicing the methods of the invention, any known array and/or method of 
making and using arrays can be incorporated in whole or in part, or variations thereof, as 
described, for example, in U.S. Patent Nos. 6,277,628; 6,277,489; 6,261,776; 6,258,606;
6,054,270; 6,048,695; 6,045,996; 6,022,963; 6,013,440; 5,965,452; 5,959,098; 5,856,174;
5,830,645; 5,770,456; 5,632,957; 5,556,752; 5,143,854; 5,807,522; 5,800,992; 5,744,305;
5,700,637; 5,434,049; see also, e.g., WO 99/51773; WO 99/09217; WO 97/46313;
WO 96/17958; see also, e.g., Johnston (1998) Curr. Biol. 8:R171-R174; Schummer (1997)
Biotechniques 23:1087-1092; Kern (1997) Biotechniques 23:120-124; Solinas-Toldo (1997)
Genes, Chromosomes & Cancer 20:399-407; Bowtell (1999) Nature Genetics Supp. 21:25-32.
See also published U.S. patent applications Nos. 20010018642; 20010019827;
20010016322; 20010014449; 20010014448; 20010012537; 20010008765.

Antibodies and Antibody-based screening methods

The invention provides isolated or recombinant antibodies that specifically 
bind to a phospholipase of the invention. These antibodies can be used to isolate, identify or
quantify the phospholipases of the invention or related polypeptides. These antibodies can be
used to inhibit the activity of an enzyme of the invention. These antibodies can be used to
isolate polypeptides related to those of the invention, e.g., related phospholipase enzymes.
The antibodies can be used in immunoprecipitation, staining (e.g., FACS), immunoaffinity
columns, and the like. If desired, nucleic acid sequences encoding for specific antigens can
be generated by immunization followed by isolation of polypeptide or nucleic acid,
amplification or cloning and immobilization of polypeptide onto an array of the invention.
Alternatively, the methods of the invention can be used to modify the structure of an antibody
produced by a cell to be modified, e.g., an antibody's affinity can be increased or decreased.
Furthermore, the ability to make or modify antibodies can be a phenotype engineered into a
cell by the methods of the invention.

Methods of immunization, producing and isolating antibodies (polyclonal and 
monoclonal) are known to those of skill in the art and described in the scientific and patent
literature, see, e.g., Coligan, CURRENT PROTOCOLS IN IMMUNOLOGY, Wiley/Greene,
NY (1991); Stites (eds.) BASIC AND CLINICAL IMMUNOLOGY (7th ed.) Lange Medical
Publications, Los Altos, CA ("Stites"); Goding, MONOCLONAL ANTIBODIES:
PRINCIPLES AND PRACTICE (2d ed.) Academic Press, New York, NY (1986); Kohler 
(1975) Nature 256:495; Harlow (1988) ANTIBODIES, A LABORATORY MANUAL, Cold 
Spring Harbor Publications, New York. Antibodies also can be generated in vitro, e.g., using 
recombinant antibody binding site expressing phage display libraries, in addition to the 
traditional in vivo methods using animals. See, e.g., Hoogenboom (1997) Trends Biotechnol. 
15:62-70; Katz (1997) Annu. Rev. Biophys. Biomol. Struct. 26:27-45. 

The polypeptides can be used to generate antibodies which bind specifically to 
the polypeptides of the invention. The resulting antibodies may be used in immunoaffinity
chromatography procedures to isolate or purify the polypeptide or to determine whether the
polypeptide is present in a biological sample. In such procedures, a protein preparation, such
as an extract, or a biological sample is contacted with an antibody capable of specifically
binding to one of the polypeptides of the invention.

In immunoaffinity procedures, the antibody is attached to a solid support, such
as a bead or other column matrix. The protein preparation is placed in contact with the
antibody under conditions in which the antibody specifically binds to one of the polypeptides
of the invention. After a wash to remove non-specifically bound proteins, the specifically
bound polypeptides are eluted.

The ability of proteins in a biological sample to bind to the antibody may be 
determined using any of a variety of procedures familiar to those skilled in the art. For 
example, binding may be determined by labeling the antibody with a detectable label such as 
a fluorescent agent, an enzymatic label, or a radioisotope. Alternatively, binding of the 
antibody to the sample may be detected using a secondary antibody having such a detectable 
label thereon. Particular assays include ELISA assays, sandwich assays, radioimmunoassays, 
and Western Blots.

Polyclonal antibodies generated against the polypeptides of the invention can 
be obtained by direct injection of the polypeptides into an animal or by administering the 
polypeptides to an animal, for example, a nonhuman. The antibody so obtained will then 
bind the polypeptide itself. In this manner, even a sequence encoding only a fragment of the
polypeptide can be used to generate antibodies which may bind to the whole native 
polypeptide. Such antibodies can then be used to isolate the polypeptide from cells 
expressing that polypeptide. 

For preparation of monoclonal antibodies, any technique which provides 
antibodies produced by continuous cell line cultures can be used. Examples include the 
hybridoma technique, the trioma technique, the human B-cell hybridoma technique, and the 
EBV-hybridoma technique (see, e.g., Cole (1985) in Monoclonal Antibodies and Cancer
Therapy, Alan R. Liss, Inc., pp. 77-96).

Techniques described for the production of single chain antibodies (see, e.g., 
U.S. Patent No. 4,946,778) can be adapted to produce single chain antibodies to the 
polypeptides of the invention. Alternatively, transgenic mice may be used to express 
humanized antibodies to these polypeptides or fragments thereof. 

Antibodies generated against the polypeptides of the invention may be used in 
screening for similar polypeptides from other organisms and samples. In such techniques, 
polypeptides from the organism are contacted with the antibody and those polypeptides 
which specifically bind the antibody are detected. Any of the procedures described above 
may be used to detect antibody binding.

Kits

The invention provides kits comprising the compositions, e.g., nucleic acids, 
expression cassettes, vectors, cells, polypeptides (e.g., phospholipases) and/or antibodies of
the invention. The kits also can contain instructional material teaching the methodologies
and industrial uses of the invention, as described herein.

Industrial and Medical Uses of the Enzymes of the Invention

The invention provides many industrial uses and medical applications for the 
enzymes of the invention, e.g., phospholipases A, B, C and D, including converting a
non-hydratable phospholipid to a hydratable form, oil degumming, processing of oils from plants,
fish, algae and the like, to name just a few applications. Methods of using phospholipase
enzymes in industrial applications are well known in the art. For example, the
phospholipases and methods of the invention can be used for the processing of fats and oils as
described, e.g., in JP Patent Application Publication H6-306386, describing converting
phospholipids present in the oils and fats into water-soluble substances containing phosphoric
acid groups.

Phospholipases of the invention can be used to process plant oils and
phospholipids such as those derived from or isolated from soy, canola, palm, cottonseed,
corn, palm kernel, coconut, peanut, sesame, sunflower. Phospholipases of the invention can
be used to process essential oils, e.g., those from fruit seed oils, e.g., grapeseed, apricot,
borage, etc. Phospholipases of the invention can be used to process oils and phospholipids in
different forms, including crude forms, degummed, gums, wash water, clay, silica, soapstock,
and the like. The phospholipases of the invention can be used to process high phosphorus
oils, fish oils, animal oils, plant oils, algae oils and the like. In any aspect of the invention,
any time a phospholipase C can be used, an alternative comprises use of a phospholipase D of
the invention and a phosphatase (e.g., using a PLD/phosphatase combination to improve
yield in a high phosphorus oil, such as a soybean oil).

Phospholipases of the invention can be used to process and make edible oils,
biodiesel oils, liposomes for pharmaceuticals and cosmetics, structured phospholipids and
structured lipids. Phospholipases of the invention can be used in oil extraction.

Phospholipases of the invention can be used to process and make various
soaps.

Caustic refining

In one exemplary process of the invention, phospholipases are used as caustic
refining aids. More particularly, a PLC or PLD and a phosphatase are used in the processes as
a drop-in, either before, during, or after a caustic neutralization refining process (either
continuous or batch refining). The amount of enzyme added may vary according to the
process. The water level used in the process should be low, e.g., about 0.5 to 5%.
Alternatively, caustic is added to the process multiple times. In addition, the process may
be performed at different temperatures (25°C to 70°C), with different acids or caustics, and at
varying pH (4-12). Acids that may be used in a caustic refining process include, but are not
limited to, phosphoric, citric, ascorbic, sulfuric, fumaric, maleic, hydrochloric and/or acetic
acids. Acids are used to hydrate non-hydratable phospholipids. Caustics that may be used
include, but are not limited to, KOH and NaOH. Caustics are used to neutralize free fatty
acids. Alternatively, phospholipases, or more particularly a PLC or a PLD and a
phosphatase, are used for purification of phytosterols from the gum/soapstock.
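
By way of illustration only, the exemplary operating ranges recited above for the caustic refining aid (water level of about 0.5 to 5%, temperature of 25°C to 70°C, pH of 4 to 12) can be expressed as a simple range check. The following Python sketch is not part of the invention; the function name `check_caustic_refining` and the dictionary keys are hypothetical names chosen for this example.

```python
# Illustrative sketch only: names and structure are hypothetical.
# The limits are the exemplary ranges recited in the text above.
CAUSTIC_REFINING_LIMITS = {
    "water_pct": (0.5, 5.0),        # water level, percent
    "temperature_c": (25.0, 70.0),  # process temperature, degrees C
    "ph": (4.0, 12.0),              # process pH
}


def check_caustic_refining(conditions):
    """Return the names of parameters that fall outside the recited ranges."""
    out_of_range = []
    for name, (low, high) in CAUSTIC_REFINING_LIMITS.items():
        value = conditions[name]
        if not (low <= value <= high):
            out_of_range.append(name)
    return out_of_range
```

An empty list indicates that all three parameters lie within the exemplary ranges; the check says nothing about enzyme dosage, which the text leaves process-dependent.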

In an alternate embodiment of the invention, the phospholipase is introduced before
caustic refining by expressing the phospholipase in a plant. In another embodiment, the
phospholipase is added during crushing of the plant, seeds or other plant part. Alternatively,
the phospholipase is added following crushing, but prior to refining (i.e., in holding vessels).
In addition, phospholipase is added as a refining pre-treatment, either with or without acid.

Another embodiment of the invention, already described, is to add the
phospholipase during a caustic refining process. In this process, the levels of acid and
caustic are varied depending on the level of phosphorus and the level of free fatty acids. In
addition, broad temperature and pH ranges are used in the process, dependent upon the type
of enzyme used.

In another embodiment of the invention, the phospholipase is added after
caustic refining (Fig. 9). In one instance, the phospholipase is added in an intense mixer or in
a retention mixer, prior to separation. Alternatively, the phospholipase is added following the
heat step. In another embodiment, the phospholipase is added in the centrifugation step. In
an additional embodiment, the phospholipase is added to the soapstock. Alternatively, the
phospholipase is added to the washwater. In another instance, the phospholipase is added
during the bleaching and/or deodorizing steps.

Oil degumming and vegetable oil processing 

The phospholipases of the invention can be used in various vegetable oil
processing steps, such as in vegetable oil extraction, particularly, in the removal of
"phospholipid gums" in a process called "oil degumming," as described above. The
invention provides methods for processing vegetable oils from various sources, such as
soybeans, rapeseed, peanuts and other nuts, sesame, sunflower, palm and corn. The methods
can be used in conjunction with processes based on extraction with, e.g., hexane, with subsequent
refining of the crude extracts to edible oils, including use of the methods and enzymes of the
invention. The first step in the refining sequence is the so-called "degumming" process,
which serves to separate phosphatides by the addition of water. The material precipitated by
degumming is separated and further processed to mixtures of lecithins. The commercial
lecithins, such as soybean lecithin and sunflower lecithin, are semi-solid or very viscous
materials. They consist of a mixture of polar lipids, mainly phospholipids, and oil, mainly
triglycerides.

The phospholipases of the invention can be used in any "degumming"
procedure, including water degumming, ALCON oil degumming (e.g., for soybeans), safinco
degumming, "super degumming," UF degumming, TOP degumming, uni-degumming, dry
degumming and ENZYMAX™ degumming. See, e.g., U.S. Patent Nos. 6,355,693;
6,162,623; 6,103,505; 6,001,640; 5,558,781; 5,264,367. Various "degumming" procedures
incorporated by the methods of the invention are described in Bockisch, M. (1998) In Fats
and Oils Handbook, The extraction of Vegetable Oils (Chapter 5), 345-445, AOCS Press,
Champaign, Illinois. The phospholipases of the invention can be used in the industrial
application of enzymatic degumming of triglyceride oils as described, e.g., in EP 513 709.

The phospholipases of the invention can be used in the industrial application
of enzymatic degumming as described, e.g., in CA 1102795, which describes a method of
isolating polar lipids from cereal lipids by the addition of at least 50% by weight of water.
This method is a modified degumming in the sense that it utilizes the principle of adding
water to a crude oil mixture.

In one aspect, the invention provides enzymatic processes comprising use of
phospholipases of the invention (e.g., a PLC) comprising hydrolysis of hydrated
phospholipids in oil at a temperature of about 20°C to 40°C, at an alkaline pH, e.g., a pH of
about pH 8 to pH 10, using a reaction time of about 3 to 10 minutes. This can result in less
than 10 ppm final oil phosphorus levels. The invention also provides enzymatic processes
comprising use of phospholipases of the invention (e.g., a PLC) comprising hydrolysis of
hydratable and non-hydratable phospholipids in oil at a temperature of about 50°C to 60°C, at
a pH slightly below neutral, e.g., of about pH 5 to pH 6.5, using a reaction time of about 30 to
60 minutes. This can result in less than 10 ppm final oil phosphorus levels.

In one aspect, the invention provides enzymatic processes that utilize a
phospholipase C enzyme to hydrolyze a glyceryl phosphoester bond and thereby enable the
return of the diacylglyceride portion of phospholipids back to the oil, e.g., a vegetable, fish or
algae oil (a "phospholipase C (PLC) caustic refining aid"); and, reduce the phospholipid
content in a degumming step to levels low enough for high phosphorus oils to be physically
refined (a "phospholipase C (PLC) degumming aid"). The two approaches can generate
different values and have different target applications.
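
By way of illustration only, the two exemplary PLC operating windows described above (about 20°C to 40°C at about pH 8 to 10 for about 3 to 10 minutes, and about 50°C to 60°C at about pH 5 to 6.5 for about 30 to 60 minutes) can be represented as data and compared against a proposed set of conditions. The Python sketch below is an assumption-laden example: the class `ProcessWindow`, the constants `ALKALINE` and `ACIDIC`, and the function `matching_windows` are hypothetical names and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessWindow:
    """Exemplary PLC operating window; all field names are hypothetical."""
    name: str
    temp_c: tuple    # (min, max) temperature, degrees C
    ph: tuple        # (min, max) pH
    minutes: tuple   # (min, max) reaction time, minutes

    def contains(self, temp_c, ph, minutes):
        """True if the given conditions lie inside this window."""
        return (self.temp_c[0] <= temp_c <= self.temp_c[1]
                and self.ph[0] <= ph <= self.ph[1]
                and self.minutes[0] <= minutes <= self.minutes[1])


# Ranges taken from the two exemplary processes recited in the text above.
ALKALINE = ProcessWindow("hydrated phospholipids, alkaline pH",
                         (20, 40), (8.0, 10.0), (3, 10))
ACIDIC = ProcessWindow("hydratable and non-hydratable phospholipids, pH slightly below neutral",
                       (50, 60), (5.0, 6.5), (30, 60))


def matching_windows(temp_c, ph, minutes):
    """Return the names of the exemplary windows that the conditions satisfy."""
    return [w.name for w in (ALKALINE, ACIDIC) if w.contains(temp_c, ph, minutes)]
```

Because the recited ranges are approximate ("about"), a practical check would likely allow some tolerance at the boundaries; the sketch treats them as hard limits for simplicity.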

In various exemplary processes of the invention, a number of distinct steps
compose the degumming process preceding the core bleaching and deodorization refining
processes. These steps include heating, mixing, holding, separating and drying. Following
the heating step, water and often acid are added and mixed to allow the insoluble
phospholipid "gum" to agglomerate into particles which may be separated. While water
separates many of the phosphatides in degumming, portions of the phospholipids are
non-hydratable phosphatides (NHPs) present as calcium or magnesium salts. Degumming
processes address these NHPs by the addition of acid. Following the hydration of
phospholipids, the oil is mixed, held and separated by centrifugation. Finally, the oil is dried
and stored, shipped or refined, as illustrated, e.g., in Figure 6. The resulting gums are either
processed further for lecithin products or added back into the meal.

In various exemplary processes of the invention phosphorus levels are
reduced low enough for physical refining. The separation process can result in potentially
higher yield losses than caustic refining. Additionally, degumming processes may generate
waste products that may not be sold as commercial lecithin, see, e.g., Figure 7 for an
exemplary degumming process for physically refined oils. Therefore, these processes have
not achieved a significant share of the market and caustic refining processes continue to
dominate the industry for soy, canola and sunflower. Note however, that a phospholipase C
enzyme employed in a special degumming process would decrease gum formation and return
the diglyceride portion of the phospholipid back to the oil.

In one aspect, a phospholipase C enzyme of the invention hydrolyzes a
phosphatide at a glyceryl phosphoester bond to generate a diglyceride and water-soluble
phosphate compound. The hydrolyzed phosphatide moves to the aqueous phase, leaving the
diglyceride in the oil phase, as illustrated in Figure 8. One objective of the PLC "Caustic
Refining Aid" is to convert the phospholipid gums formed during neutralization into a
diacylglyceride that will migrate back into the oil phase. In contrast, one objective of the
"PLC Degumming Aid" is to reduce the phospholipids in crude oil to a phosphorus
equivalent of less than 10 parts per million (ppm).

In one aspect, a phospholipase C enzyme of the invention will hydrolyze the
phosphatide from both hydratable and non-hydratable phospholipids in neutralized crude and
degummed oils before bleaching and deodorizing. The target enzyme can be applied as a
drop-in product in the existing caustic neutralization process, as illustrated in Figure 9. In
this aspect, the enzyme will not be required to withstand extreme pH levels if it is added after
the addition of caustic.

In one aspect, a phospholipase of the invention enables phosphorus to be
removed to the low levels acceptable in physical refining. In one aspect, a PLC of the
invention will hydrolyze the phosphatide from both hydratable and non-hydratable
phospholipids in crude oils before bleaching and deodorizing. The target enzyme can be
applied as a drop-in product in the existing degumming operation, see, e.g., Figure 10. Given
sub-optimal mixing in commercial equipment, it is likely that acid will be required to bring
the non-hydratable phospholipids in contact with the enzyme at the oil/water interface.
Therefore, in one aspect, an acid-stable PLC of the invention is used.

In one aspect, a PLC Degumming Aid process of the invention can eliminate
losses in one, or all three, areas noted in Table 2. Losses associated in a PLC process can be
estimated to be about 0.8% versus 5.2% on a mass basis due to removal of the phosphatide.

Table 2: Losses Addressed by PLC Products

                                                      Caustic Refining Aid    Degumming Aid
1) Oil lost in gum formation & separation     2.1%             X                    X
2) Saponified oil in caustic addition         3.1%                                  X
3) Oil trapped in clay in bleaching*         <1.0%             X                    X
Total Yield Loss                             ~5.2%           ~2.1%                ~5.2%
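
By way of illustration only, the yield-loss figures recited in Table 2 and in the preceding paragraph can be worked through arithmetically. The Python sketch below assumes that the total of about 5.2% is the sum of the two itemized losses (2.1% and 3.1%), with the clay loss left out because the table gives only an upper bound for it; that reading, the helper name `oil_recovered`, and the batch size used in the usage note are assumptions of this example and are not stated in the specification.

```python
from decimal import Decimal

# Loss figures as recited in Table 2 and the accompanying text (mass percent).
GUM_LOSS = Decimal("2.1")             # oil lost in gum formation and separation
SAPONIFICATION_LOSS = Decimal("3.1")  # saponified oil in caustic addition
PLC_PROCESS_LOSS = Decimal("0.8")     # estimated loss in a PLC process


def total_conventional_loss():
    """Sum of the two itemized losses; the clay loss (<1.0%) is omitted
    because only an upper bound is given for it."""
    return GUM_LOSS + SAPONIFICATION_LOSS


def oil_recovered(batch_tonnes):
    """Hypothetical helper: additional oil retained per batch if losses fall
    from the conventional total to the PLC estimate."""
    saved_pct = total_conventional_loss() - PLC_PROCESS_LOSS
    return Decimal(batch_tonnes) * saved_pct / Decimal(100)
```

For example, under these assumptions a hypothetical 1,000 tonne batch would retain `oil_recovered(1000)` tonnes of additional oil, i.e., the difference between 5.2% and 0.8% of the batch.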



Additional potential benefits of this process of the invention include the following: 

♦ Reduced adsorbents - less adsorbents required with lower (< 5 ppm) phosphorus

♦ Lower chemical usage - less chemical and processing costs associated with hydration
of non-hydratable phospholipids

♦ Lower waste generation - less water required to remove phosphorus from oil

Oils processed (e.g., "degummed") by the methods of the invention include
plant oilseeds, e.g., soybean oil, rapeseed oil and sunflower oil. In one aspect, the "PLC
Caustic Refining Aid" of the invention can save 1.2% over existing caustic refining
processes. The refining aid application addresses soy oil that has been degummed for lecithin
and these are also excluded from the value/load calculations.

Performance targets of the processes of the invention can vary according to the 
applications and more specifically to the point of enzyme addition, see Table 3.





                                      Caustic Refining Aid     Degumming Aid
Incoming Oil Phosphorus Levels        <200 ppm*                600-1,400 ppm
Final Oil Phosphorus Levels           <10 ppm†                 <10 ppm
Hydratable & Non-hydratable gums      Yes                      Yes
Residence Time                        3-10 minutes             30 minutes‡
Liquid Formulation                    Yes                      Yes
Target pH                             8-10‡‡                   5.0-5.5‡*
Target Temperature                    20-40°C                  ~50-60°C
Water Content                         <5%                      1-1.25%
Enzyme Formulation Purity             No lipase/protease       No lipase/protease
Other Key Requirements                Removal of Fe            Removal of Fe

*Water degummed oil
†Target levels achieved in upstream caustic neutralization step but must be maintained
‡1-2 hours existing
‡*Acid degumming will require an enzyme that is stable in much more acidic conditions: pH at 2.3 for citric
acid at 5%. (~Roehm USPN 6,001,640)
‡‡The pH of neutralized oil is NOT neutral. Testing at POS indicates that the pH will be in the alkaline range
from 6.5-10 (December 9, 2002). Typical pH range needs to be determined.

Other processes that can be used with a phospholipase of the invention, e.g., a
phospholipase A1, can convert non-hydratable native phospholipids to a hydratable form. In
one aspect, the enzyme is sensitive to heat. This may be desirable, since heating the oil can
destroy the enzyme. However, the degumming reaction must be adjusted to pH 4-5 and 60°C
to accommodate this enzyme. At 300 Units/kg oil saturation dosage, this exemplary process
is successful at taking previously water-degummed oil phosphorus content down to <10
ppm P. Advantages can be decreased H2O content and resultant savings in usage, handling
and waste. Table 4 lists exemplary applications for industrial uses for enzymes of the
invention:

Table 4: Exemplary Application 



Caustic Refining Aid 



Degumming Aid 



Soy oil w/ lecithin production 



X 



Chemically refined soy oil, sunflower oil,
canola oil



X



Low phosphatide oils (e.g. palm)



X 



In addition to these various "degumming" processes, the phospholipases of the
invention can be used in any vegetable oil processing step. For example, phospholipase
enzymes of the invention can be used in place of PLA, e.g., phospholipase A2, in any
vegetable oil processing step. Oils that are "processed" or "degummed" in the methods of the
invention include soybean oils, rapeseed oils, corn oils, oil from palm kernels, canola oils,
sunflower oils, sesame oils, peanut oils, and the like. The main products from this process
include triglycerides.

In one exemplary process, when the enzyme is added to and reacted with a
crude oil, the amount of phospholipase employed is about 10-10,000 units, or, alternatively,
about 100-2,000 units, per 1 kg of crude oil. The enzyme treatment is conducted for 5 min to
10 hours at a temperature of 30°C to 90°C, or, alternatively, about 40°C to 70°C. The
conditions may vary depending on the optimum temperature of the enzyme. The amount of
water added to dissolve the enzyme is 5-1,000 wt. parts per 100 wt. parts of crude oil, or,
alternatively, about 10 to 200 wt. parts per 100 wt. parts of crude oil.
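
By way of illustration only, the dosage and water ranges recited above scale linearly with the mass of crude oil. The Python sketch below computes the enzyme range (in units) and the water range (in kg) for a given batch; the function name `enzyme_and_water_ranges` and its `narrow` flag, which selects the alternative narrower ranges, are hypothetical and not part of the specification.

```python
def enzyme_and_water_ranges(crude_oil_kg, narrow=False):
    """Return ((min_units, max_units), (min_water_kg, max_water_kg)).

    Uses the exemplary ranges recited in the text: 10-10,000 units per kg of
    crude oil and 5-1,000 wt. parts water per 100 wt. parts oil, or, when
    narrow=True, the alternative 100-2,000 units per kg and 10-200 wt. parts.
    """
    units_per_kg = (100, 2000) if narrow else (10, 10000)
    water_parts_per_100 = (10, 200) if narrow else (5, 1000)
    units = tuple(u * crude_oil_kg for u in units_per_kg)
    water_kg = tuple(p * crude_oil_kg / 100 for p in water_parts_per_100)
    return units, water_kg
```

For a hypothetical 500 kg batch, the broad ranges give 5,000 to 5,000,000 units of enzyme and 25 to 5,000 kg of water, while the narrower alternative gives 50,000 to 1,000,000 units and 50 to 1,000 kg of water.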

Upon completion of such enzyme treatment, the enzyme liquid is separated
with an appropriate means such as a centrifugal separator and the processed oil is obtained.
Phosphorus-containing compounds produced by enzyme decomposition of gummy
substances in such a process are practically all transferred into the aqueous phase and
removed from the oil phase. Upon completion of the enzyme treatment, if necessary, the
processed oil can be additionally washed with water or organic or inorganic acid such as, e.g.,
acetic acid, phosphoric acid, succinic acid, and the like, or with salt solutions.

In one exemplary process for ultra-filtration degumming, the enzyme is bound
to a filter or the enzyme is added to an oil prior to filtration or the enzyme is used to
periodically clean filters.

In one exemplary process for a phospholipase-mediated physical refining aid,
water and enzyme are added to crude oil. In one aspect, a PLC or a PLD and a phosphatase
are used in the process. In phospholipase-mediated physical refining, the water level can be
low, i.e., 0.5 - 5%, and the process time should be short (less than 2 hours, or, less than 60
minutes, or, less than 30 minutes, or, less than 15 minutes, or, less than 5 minutes). The
process can be run at different temperatures (25°C to 70°C), using different acids and/or
caustics, at different pHs (e.g., 3-10).

In alternate aspects, water degumming is performed first to collect lecithin by
centrifugation and then PLC or PLC and PLA is added to remove non-hydratable
phospholipids (the process should be performed under low water concentration). In another
aspect, water degumming of crude oil to less than 10 ppm (edible oils) and subsequent
physical refining (less than 50 ppm for biodiesel) is performed. In one aspect, an emulsifier
is added and/or the crude oil is subjected to an intense mixer to promote mixing.
Alternatively, an emulsion-breaker is added and/or the crude oil is heated to promote
separation of the aqueous phase. In another aspect, an acid is added to promote hydration of
non-hydratable phospholipids. Additionally, phospholipases can be used to mediate
purification of phytosterols from the gum/soapstock.

The enzymes of the invention can be used in any oil processing method, e.g., 
degumming or equivalent processes. For example, the enzymes of the invention can be used 
in processes as described in U.S. Patent Nos. 5,558,781; 5,264,367; 6,001,640. The process 
described in USPN 5,558,781 uses either phospholipase A1, A2 or B, essentially breaking
down lecithin in the oil that behaves as an emulsifier. 

The enzymes and methods of the invention can be used in processes for the
reduction of phosphorus-containing components in edible oils comprising a high amount of
non-hydratable phosphorus by use of a phospholipase of the invention, e.g., a polypeptide
having a phospholipase A and/or B activity, as described, e.g., in EP Patent Number: EP
0869167. In one aspect, the edible oil is a crude oil, a so-called "non-degummed oil." In one
aspect, the method treats a non-degummed oil, including pressed oils or extracted oils, or a
mixture thereof, from, e.g., rapeseed, soybean, sesame, peanut, corn or sunflower. The
phosphatide content in a crude oil can vary from 0.5 to 3% w/w corresponding to a
phosphorus content in the range of 200 to 1200 ppm, or, in the range of 250 to 1200 ppm.
Apart from the phosphatides, the crude oil can also contain small concentrations of
carbohydrates, sugar compounds and metal/phosphatide acid complexes of Ca, Mg and Fe. In
one aspect, the process comprises treatment of a phospholipid or lysophospholipid with the
phospholipase of the invention so as to hydrolyze fatty acyl groups. In one aspect, the
phospholipid or lysophospholipid comprises lecithin or lysolecithin. In one aspect of the
process the edible oil has a phosphorus content of between about 50 and 250 ppm, and the
process comprises treating the oil with a phospholipase of the invention so as to hydrolyze a
major part of the phospholipid and separating an aqueous phase containing the hydrolyzed
phospholipid from the oil. In one aspect, prior to the enzymatic degumming process the oil is
water-degummed. In one aspect, the methods provide for the production of an animal feed
comprising mixing the phospholipase of the invention with feed substances and at least one
phospholipid.
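
By way of illustration only, the correspondence recited above between phosphatide content (0.5 to 3% w/w) and phosphorus content (200 to 1200 ppm) implies a ratio of about 25 parts phosphatide per part phosphorus. The Python sketch below applies that ratio in both directions; the factor of 25 is inferred from the recited figures and is not given as a constant in the specification, and the function names are hypothetical.

```python
# Ratio inferred from the recited figures: 0.5% w/w (5,000 ppm) phosphatide
# corresponds to 200 ppm phosphorus, and 3% w/w (30,000 ppm) to 1200 ppm.
# This is an assumption of the example, not a measured constant.
PHOSPHATIDE_PER_PHOSPHORUS = 25

PPM_PER_PERCENT = 10_000  # 1% w/w equals 10,000 ppm


def phosphorus_ppm_from_phosphatide(phosphatide_pct_ww):
    """Estimate phosphorus content (ppm) from phosphatide content (% w/w)."""
    phosphatide_ppm = phosphatide_pct_ww * PPM_PER_PERCENT
    return phosphatide_ppm / PHOSPHATIDE_PER_PHOSPHORUS


def phosphatide_pct_from_phosphorus(phosphorus_ppm):
    """Estimate phosphatide content (% w/w) from phosphorus content (ppm)."""
    return phosphorus_ppm * PHOSPHATIDE_PER_PHOSPHORUS / PPM_PER_PERCENT
```

Under this assumed ratio, a final oil phosphorus level of 10 ppm corresponds to roughly 0.025% w/w residual phosphatide.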

The enzymes and methods of the invention can be used in processes of oil
degumming as described, e.g., in WO 98/18912. The phospholipases of the invention can be
used to reduce the content of phospholipid in an edible oil. The process can comprise
treating the oil with a phospholipase of the invention to hydrolyze a major part of the
phospholipid and separating an aqueous phase containing the hydrolyzed phospholipid from
the oil. This process is applicable to the purification of any edible oil which contains a
phospholipid, e.g., vegetable oils, such as soybean oil, rapeseed oil and sunflower oil, fish
oils, algae and animal oils and the like. Prior to the enzymatic treatment, the vegetable oil is
preferably pretreated to remove slime (mucilage), e.g., by wet refining. The oil can contain
50-250 ppm of phosphorus as phospholipid at the start of the treatment with phospholipase,
and the process of the invention can reduce this value to below 5-10 ppm.

The enzymes of the invention can be used in processes as described in JP
Application No.: H5-132283, filed April 25, 1993, which comprises a process for the
purification of oils and fats comprising a step of converting phospholipids present in the oils
and fats into water-soluble substances containing phosphoric acid groups and removing them
as water-soluble substances. An enzyme action is used for the conversion into water-soluble
substances. An enzyme having a phospholipase C activity is preferably used as the enzyme.

The enzymes of the invention can be used in processes described as the
"Organic Refining Process" (ORP) (IPH, Omaha, NE), which is a method of refining seed
oils. ORP may have advantages over traditional chemical refining, including improved
refined oil yield, value-added co-products, reduced capital costs and lower environmental
costs.

The enzymes of the invention can be used in processes for the treatment of an
oil or fat, animal or vegetal, raw, semi-processed or refined, comprising adding to such oil or
fat at least one enzyme of the invention that allows hydrolyzing and/or depolymerizing the
non-glyceridic compounds contained in the oil, as described, e.g., in EP Application number:
82870032.8. Exemplary methods of the invention for hydrolysis and/or depolymerization of
non-glyceridic compounds in oils are:

1) The addition and mixture in oils and fats of an enzyme of the invention or enzyme
complexes previously dissolved in a small quantity of appropriate solvent (for
example water). A certain number of solvents are possible, but a non-toxic and
suitable solvent for the enzyme is chosen. This addition may be done in processes
with successive loads, as well as in continuous processes. The quantity of enzyme(s)
necessary to be added to oils and fats, according to this process, may range,
depending on the enzymes and the products to be processed, from 20 to 400 ppm, i.e.,
from 0.02 kg to 0.4 kg of enzyme for 1000 kg of oil or fat, and preferably from 20 to
100 ppm, i.e., from 0.02 to 0.1 kg of enzyme for 1000 kg of oil, these values being
understood to be for concentrated enzymes, i.e., without diluent or solvent.

2) Passage of the oil or fat through a fixed or insoluble filtering bed of enzyme(s) of the
invention on solid or semi-solid supports, preferably presenting a porous or fibrous
structure. In this technique, the enzymes are trapped in the micro-cavities of the
porous or fibrous structure of the supports. These consist, for example, of resins or
synthetic polymers, cellulose carbonates, gels such as agarose, filaments of polymers
or copolymers with porous structure, trapping small droplets of enzyme in solution in
their cavities. Concerning the enzyme concentration, it is possible to go up to the
saturation of the supports.

3) Dispersion of the oils and fats in the form of fine droplets, in a diluted enzymatic
solution, preferably containing 0.2 to 4% in volume of an enzyme of the invention.
This technique is described, e.g., in Belgian patent No. 595.219. A cylindrical
column with a height of several meters, with conical lid, is filled with a diluted
enzymatic solution. For this purpose, a solvent that is non-toxic and non-miscible in
the oil or fat to be processed, preferably water, is chosen. The bottom of the column
is equipped with a distribution system in which the oil or fat is continuously injected
in an extremely divided form (approximately 10,000 flux per m²). Thus an infinite
number of droplets of oil or fat are formed, which slowly rise in the solution of
enzymes and meet at the surface, to be evacuated continuously at the top of the
conical lid of the reactor.
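
By way of illustration only, the dosage figures given in methods 1) and 3) above can be
reproduced with a short calculation. The following Python sketch is not taken from the cited
applications; the function names enzyme_mass_kg and enzyme_volume_l, and the 500 L column
charge, are assumptions introduced here for the example.

```python
def enzyme_mass_kg(oil_mass_kg: float, dose_ppm: float) -> float:
    """Mass of concentrated enzyme (kg) for a dose in ppm by weight of oil."""
    if oil_mass_kg < 0 or dose_ppm < 0:
        raise ValueError("oil mass and dose must be non-negative")
    return oil_mass_kg * dose_ppm / 1_000_000


def enzyme_volume_l(solution_volume_l: float, percent_by_volume: float) -> float:
    """Volume of enzyme (litres) in a dilute solution at a given % by volume."""
    if solution_volume_l < 0 or not 0 <= percent_by_volume <= 100:
        raise ValueError("volume must be non-negative and percent within 0-100")
    return solution_volume_l * percent_by_volume / 100


# Method 1: 20 to 400 ppm on a 1000 kg batch (batch size taken from the text).
for ppm in (20, 100, 400):
    print(f"{ppm} ppm on 1000 kg of oil: {enzyme_mass_kg(1000, ppm):.2f} kg enzyme")

# Method 3: 0.2 to 4% by volume in a hypothetical 500 L column charge.
for pct in (0.2, 4.0):
    print(f"{pct}% of 500 L: {enzyme_volume_l(500, pct):.0f} L enzyme")
```

The first loop recovers the 0.02 kg, 0.1 kg and 0.4 kg figures stated in method 1) for
1000 kg of oil or fat.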

Palm oil can be pre-treated before treatment with an enzyme of the invention.
For example, about 30 kg of raw palm oil is heated to +50°C. 1% solutions of cellulases and
pectinases are prepared in distilled water. 600 g of each of these is added to aqueous
solutions of the oil under strong agitation for a few minutes. The oil is then kept at +50°C
under moderate agitation, for a total reaction time of two hours. Then, the temperature is raised to
+90°C to deactivate the enzymes and prepare the mixture for filtration and further processing.
The oil is dried under vacuum and filtered with a filtering aid.

The enzymes of the invention can be used in processes as described in EP
patent EP 0 513 709 B2. For example, the invention provides a process for the reduction of
the content of phosphorus-containing components in
animal and vegetable oils by enzymatic decomposition using a phospholipase of the
invention. A predemucilaginated animal or vegetable oil with a phosphorus content of 50 to
250 ppm is agitated with an organic carboxylic acid, and the pH value of the resulting mixture
is set to pH 4 to pH 6. An enzyme solution that contains phospholipase A1, A2, or B of the
invention is added to the mixture in a mixing vessel under turbulent stirring and with the
formation of fine droplets, forming an emulsion with 0.5 to 5% by weight relative to the oil.
The emulsion is conducted through at least one subsequent reaction vessel under
turbulent motion during a reaction time of 0.1 to 10 hours at temperatures in the range of 20
to 80°C. The treated oil, after separation of the aqueous solution, has a phosphorus
content under 5 ppm.

The organic refining process is applicable to both crude and degummed oil.
The process uses inline addition of an organic acid under controlled process conditions, in
conjunction with conventional centrifugal separation. The water separated naturally from the
vegetable oil phospholipids ("VOP") is recycled and reused. The total water usage can be
substantially reduced as a result of the Organic Refining Process.

The phospholipases and methods of the invention can also be used in the
enzymatic treatment of edible oils, as described, e.g., in U.S. Patent No. 6,162,623. In this
exemplary method, the invention provides an amphiphilic enzyme. It can be immobilized,
e.g., by preparing an emulsion containing a continuous hydrophobic phase and a dispersed
aqueous phase containing the enzyme and a carrier for the enzyme and removing water from
the dispersed phase until this phase turns into solid enzyme-coated particles. The enzyme can
be a lipase. The immobilized lipase can be used for reactions catalyzed by lipase such as
interesterification of mono-, di- or triglycerides, de-acidification of a triglyceride oil, or
removal of phospholipids from a triglyceride oil when the lipase is a phospholipase. The
aqueous phase may contain a fermentation liquid, an edible triglyceride oil may be the
hydrophobic phase, and carriers include sugars, starch, dextran, water-soluble cellulose
derivatives and fermentation residues. This exemplary method can be used to process
triglycerides, diglycerides, monoglycerides, glycerol, phospholipids or fatty acids, which may
be in the hydrophobic phase. In one aspect, the process for the removal of phospholipids
from triglyceride oil comprises mixing a triglyceride oil containing phospholipids with a
preparation containing a phospholipase of the invention; hydrolyzing the phospholipids to
lysophospholipid; and separating the hydrolyzed phospholipids from the oil, wherein the
phospholipase is an immobilized phospholipase.

The phospholipases and methods of the invention can also be used in the
enzymatic treatment of edible oils, as described, e.g., in U.S. Patent No. 6,127,137. This
exemplary method hydrolyzes both fatty acyl groups in intact phospholipid. The
phospholipase of the invention used in this method has no lipase activity and is active at
very low pH. These properties make it very suitable for use in oil degumming, as enzymatic
and alkaline hydrolysis (saponification) of the oil can both be suppressed. In one aspect, the
invention provides a process for hydrolyzing fatty acyl groups in a phospholipid or
lysophospholipid comprising treating the phospholipid or lysophospholipid with the
phospholipase that hydrolyzes both fatty acyl groups in a phospholipid and is essentially free
of lipase activity. In one aspect, the phospholipase of the invention has a temperature
optimum at about 50°C, measured at pH 3 to pH 4 for 10 minutes, and a pH optimum of
about pH 3, measured at 40°C for about 10 minutes. In one aspect, the phospholipid or
lysophospholipid comprises lecithin or lysolecithin. In one aspect, after hydrolyzing a major
part of the phospholipid, an aqueous phase containing the hydrolyzed phospholipid is
separated from the oil. In one aspect, the invention provides a process for removing
phospholipid from an edible oil, comprising treating the oil at pH 1.5 to 3 with a dispersion of
an aqueous solution of the phospholipase of the invention, and separating an aqueous phase
containing the hydrolyzed phospholipid from the oil. In one aspect, the oil is treated to
remove mucilage prior to the treatment with the phospholipase. In one aspect, the oil prior to
the treatment with the phospholipase contains the phospholipid in an amount corresponding
to 50 to 250 ppm of phosphorus. In one aspect, the treatment with phospholipase is done at
30°C to 45°C for 1 to 12 hours at a phospholipase dosage of 0.1 to 10 mg/l in the presence of
0.5 to 5% of water.
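
By way of illustration only, the treatment ranges recited in the preceding paragraph can be
collected into a simple range check. The following Python sketch is not taken from U.S.
Patent No. 6,127,137; the names DegummingConditions, RANGES and out_of_range are
assumptions introduced here for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DegummingConditions:
    ph: float
    temperature_c: float
    time_h: float
    dose_mg_per_l: float
    water_percent: float


# Ranges quoted in the paragraph above: parameter name -> (low, high).
RANGES = {
    "ph": (1.5, 3.0),
    "temperature_c": (30.0, 45.0),
    "time_h": (1.0, 12.0),
    "dose_mg_per_l": (0.1, 10.0),
    "water_percent": (0.5, 5.0),
}


def out_of_range(conditions: DegummingConditions) -> list[str]:
    """Return the names of parameters that fall outside the quoted ranges."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= getattr(conditions, name) <= high
    ]


# A hypothetical run inside every range, and one with the pH set too high.
print(out_of_range(DegummingConditions(2.5, 40.0, 6.0, 1.0, 2.0)))
print(out_of_range(DegummingConditions(5.0, 40.0, 6.0, 1.0, 2.0)))
```

The check treats the range limits as inclusive, which is a simplifying assumption of this
sketch.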

The phospholipases and methods of the invention can also be used in the
enzymatic treatment of edible oils, as described, e.g., in U.S. Patent No. 6,025,171. In this
exemplary method, enzymes of the invention are immobilized by preparing an emulsion
containing a continuous hydrophobic phase, such as a triglyceride oil, and a dispersed
aqueous phase containing an amphiphilic enzyme, such as lipase or a phospholipase of the
invention, and carrier material that is partly dissolved and partly undissolved in the aqueous
phase, and removing water from the aqueous phase until the phase turns into solid
enzyme-coated carrier particles. The undissolved part of the carrier material may be a material that is
insoluble in water and oil, or a water-soluble material in undissolved form because the
aqueous phase is already saturated with the water-soluble material. The aqueous phase may
be formed with a crude lipase fermentation liquid containing fermentation residues and
biomass that can serve as carrier materials. Immobilized lipase is useful for ester
rearrangement and de-acidification in oils. After a reaction, the immobilized enzyme can be
regenerated for a subsequent reaction by adding water to obtain partial dissolution of the
carrier, and, with the resultant enzyme- and carrier-containing aqueous phase dispersed in a
hydrophobic phase, evaporating water to again form enzyme-coated carrier particles.

The phospholipases and methods of the invention can also be used in the
enzymatic treatment of edible oils, as described, e.g., in U.S. Patent No. 6,143,545. This
exemplary method is used for reducing the content of phosphorus-containing components in
an edible oil comprising a high amount of non-hydratable phosphorus using a
phospholipase of the invention. In one aspect, the method is used to reduce the content of
phosphorus-containing components in an edible oil having a non-hydratable phosphorus
content of at least 50 ppm measured by pre-treating the edible oil, at 60°C, by addition of a
solution comprising citric acid monohydrate in water (added water vs. oil equals 4.8% w/w;
[citric acid] in water phase = 106 mM, in water/oil emulsion = 4.6 mM) for 30 minutes;
transferring 10 ml of the pre-treated water-in-oil emulsion to a tube; heating the emulsion in a
boiling water bath for 30 minutes; centrifuging at 5000 rpm for 10 minutes; transferring about
8 ml of the upper (oil) phase to a new tube and leaving it to settle for 24 hours; and drawing 2
g from the upper clear phase for measurement of the non-hydratable phosphorus content
(ppm) in the edible oil. The method also can comprise contacting an oil at a pH from about
pH 5 to 8 with an aqueous solution of a phospholipase A or B of the invention (e.g., PLA1,
PLA2, or a PLB), which solution is emulsified in the oil until the phosphorus content of the
oil is reduced to less than 11 ppm, and then separating the aqueous phase from the treated oil.

The phospholipases and methods of the invention can also be used in the
enzymatic treatment of edible oils, as described, e.g., in U.S. Patent No. 5,532,163. The
invention provides processes for the refining of oil and fat by which phospholipids in the oil
and fat to be treated can be decomposed and removed efficiently. In one aspect, the invention
provides a process for the refining of oil and fat which comprises reacting, in an emulsion,
the oil and fat with an enzyme of the invention, e.g., an enzyme having an activity to
decompose glycerol-fatty acid ester bonds in glycerophospholipids (e.g., a PLA2 of the
invention); and another process in which the enzyme-treated oil and fat is washed with water
or an acidic aqueous solution. In one aspect, the acidic aqueous solution to be used in the
washing step is a solution of at least one acid, e.g., citric acid, acetic acid, phosphoric acid
and salts thereof. In one aspect, the emulsified condition is formed using 30 weight parts or
more of water per 100 weight parts of the oil and fat. Since oil and fat can be purified
without employing the conventional alkali refining step, generation of washing waste water
and industrial waste can be reduced. In addition, the recovery yield of oil is improved
because loss of neutral oil and fat due to their inclusion in these wastes does not occur in the
inventive process. In one aspect, the invention provides a process for refining oil and fat
containing about 100 to 10,000 ppm of phospholipids which comprises: reacting, in an
emulsified condition, said oil and fat with an enzyme of the invention having activity to
decompose glycerol-fatty acid ester bonds in glycerophospholipids. In one aspect, the
invention provides processes for refining oil and fat containing about 100 to 10,000 ppm of
phospholipids which comprise reacting, in an emulsified condition, oil and fat with an
enzyme of the invention having activity to decompose glycerol-fatty acid ester bonds in
glycerophospholipids; and subsequently washing the treated oil and fat with a washing water.

The phospholipases and methods of the invention can also be used in the
enzymatic treatment of edible oils, as described, e.g., in U.S. Patent No. 5,264,367. The
content of phosphorus-containing components and the iron content of an edible vegetable or
animal oil, such as an oil, e.g., soybean oil, which has been wet-refined to remove mucilage,
are reduced by enzymatic decomposition by contacting the oil with an aqueous solution of an
enzyme of the invention, e.g., a phospholipase A1, A2, or B, and then separating the aqueous
phase from the treated oil. In one aspect, the invention provides an enzymatic method for
decreasing the content of phosphorus- and iron-containing components in oils, which have
been refined to remove mucilage. An oil, which has been refined to remove mucilage, can be
treated with an enzyme of the invention, e.g., phospholipase C, A1, A2, or B. Phosphorus
contents below 5 ppm and iron contents below 1 ppm can be achieved. The low iron content
can be advantageous for the stability of the oil.

The phospholipases and methods of the invention can also be used for
preparing transesterified oils, as described, e.g., in U.S. Patent No. 5,288,619. The invention
provides methods for enzymatic transesterification for preparing a margarine oil having both
low trans-acid and low intermediate chain fatty acid content. The method includes the steps
of providing a transesterification reaction mixture containing a stearic acid source material
and an edible liquid vegetable oil, transesterifying the stearic acid source material and the
vegetable oil using a 1-, 3-positionally specific lipase, and then finally hydrogenating the
fatty acid mixture to provide a recycle stearic acid source material for a recyclic reaction with
the vegetable oil. The invention also provides a counter-current method for preparing a
transesterified oil. The method includes the steps of providing a transesterification reaction
zone containing a 1-, 3-positionally specific lipase, introducing a vegetable oil into the
transesterification zone, introducing a stearic acid source material, conducting a supercritical
gas or subcritical liquefied gas counter-current fluid, carrying out a transesterification
reaction of the triglyceride stream with the stearic acid or stearic acid monoester stream in the
reaction zone, withdrawing a transesterified triglyceride margarine oil stream, withdrawing a
counter-current fluid phase, hydrogenating the transesterified stearic acid or stearic acid
monoester to provide a hydrogenated recycle stearic acid source material, and introducing the
hydrogenated recycle stearic acid source material into the reaction zone.

In one aspect, the highly unsaturated phospholipid compound may be
converted into a triglyceride by appropriate use of a phospholipase C of the invention to
remove the phosphate group in the sn-3 position, followed by 1,3 lipase acyl ester synthesis.
The 2-substituted phospholipid may be used as a functional food ingredient directly, or may
be subsequently selectively hydrolyzed in reactor 160 using an immobilized phospholipase C
of the invention to produce a 1-diglyceride, followed by enzymatic esterification as
described herein to produce a triglyceride product having a 2-substituted polyunsaturated
fatty acid component.

The phospholipases and methods of the invention can also be used in a
vegetable oil enzymatic degumming process as described, e.g., in U.S. Patent No. 6,001,640.
This method of the invention comprises a degumming step in the production of edible oils.
Vegetable oils from which hydratable phosphatides have been eliminated by a previous
aqueous degumming process are freed from non-hydratable phosphatides by enzymatic
treatment using a phospholipase of the invention. The process can be gentle, economical and
environment-friendly. Phospholipases that only hydrolyze lysolecithin, but not lecithin, are
used in this degumming process.

In one aspect, to allow the enzyme of the invention to act, both phases, the oil
phase and the aqueous phase that contains the enzyme, must be intimately mixed. It may not
be sufficient to merely stir them. Good dispersion of the enzyme in the oil is aided if it is 
dissolved in a small amount of water, e.g., 0.5-5 weight-% (relative to the oil), and emulsified 
in the oil in this form, to form droplets of less than 10 micrometers in diameter (weight 
average). The droplets can be smaller than 1 micrometer. Turbulent stirring can be done 
with radial velocities above 100 cm/sec. The oil also can be circulated in the reactor using an 
external rotary pump. The aqueous phase containing the enzyme can also be finely dispersed 
by means of ultrasound action. A dispersion apparatus can be used. 

The enzymatic reaction probably takes place at the border surface between the
oil phase and the aqueous phase. It is the goal of all these measures for mixing to create the
greatest possible surface for the aqueous phase which contains the enzyme. The addition of
surfactants increases the microdispersion of the aqueous phase. In some cases, therefore,
surfactants with HLB values above 9, such as Na-dodecyl sulfate, are added to the enzyme
solution, as described, e.g., in EP-A 0 513 709. A similarly effective method for improving
emulsification is the addition of lysolecithin. The amounts added can lie in the range of
0.001% to 1%, with reference to the oil. The temperature during enzyme treatment is not
critical. Temperatures between 10°C and 80°C can be used, but the latter can only be applied
for a short time. In this aspect, a phospholipase of the invention having a good temperature
and/or low pH tolerance is used. Application temperatures of between 30°C and 50°C are
optimal. The treatment period depends on the temperature and can be kept shorter with an
increasing temperature. Times of 0.1 to 10 hours, or 1 to 5 hours, are generally sufficient. The
reaction takes place in a degumming reactor, which can be divided into stages, as described,
e.g., in DE-A 43 39 556. Therefore continuous operation is possible, along with batch
operation. The reaction can be carried out in different temperature stages. For example,
incubation can take place for 3 hours at 40°C, then for 1 hour at 60°C. If the reaction proceeds
in stages, this also opens up the possibility of adjusting different pH values in the individual
stages. For example, in the first stage the pH of the solution can be adjusted to 7,
and in a second stage to 2.5, by adding citric acid. In at least one stage, however,
the pH of the enzyme solution must be below 4, or below 3. If the pH is subsequently
adjusted below this level, a deterioration of effect may be found. Therefore the citric acid can
be added to the enzyme solution before the latter is mixed into the oil.
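
By way of illustration only, a staged treatment of the kind described above can be written
down as a small schedule and checked against the stated constraints. The following Python
sketch is hypothetical: the names Stage, total_hours and has_acidic_stage are assumptions
introduced here, and pairing the temperature example (3 hours at 40°C, then 1 hour at 60°C)
with the pH example (7, then 2.5) in the same two stages is likewise an assumption, since the
text gives the two examples separately.

```python
from typing import NamedTuple, Sequence


class Stage(NamedTuple):
    hours: float
    temperature_c: float
    ph: float


def total_hours(stages: Sequence[Stage]) -> float:
    """Total treatment time over all stages."""
    return sum(stage.hours for stage in stages)


def has_acidic_stage(stages: Sequence[Stage], ph_limit: float = 4.0) -> bool:
    """True if at least one stage runs below the given pH limit."""
    return any(stage.ph < ph_limit for stage in stages)


# Assumed pairing of the two examples from the text (see the note above).
schedule = [Stage(3.0, 40.0, 7.0), Stage(1.0, 60.0, 2.5)]

print("total time (h):", total_hours(schedule))
print("time within 0.1-10 h:", 0.1 <= total_hours(schedule) <= 10.0)
print("one stage below pH 4:", has_acidic_stage(schedule, 4.0))
print("one stage below pH 3:", has_acidic_stage(schedule, 3.0))
```

In this sketch the schedule totals 4 hours and satisfies both the pH 4 and the pH 3
conditions, because its second stage is at pH 2.5.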

After completion of the enzyme treatment, the enzyme solution, together with
the decomposition products of the NHP contained in it, can be separated from the oil phase,
in batches or continuously, e.g., by means of centrifugation. Since the enzymes are
characterized by a high level of stability and the amount of the decomposition products
contained in the solution is slight (they may precipitate as sludge), the same aqueous enzyme
phase can be used several times. There is also the possibility of freeing the enzyme of the
sludge, see, e.g., DE-A 43 39 556, so that an enzyme solution which is essentially free of
sludge can be used again. In one aspect of this degumming process, oils which contain less
than 15 ppm phosphorus are obtained. One goal is phosphorus contents of less than 10 ppm,
or less than 5 ppm. With phosphorus contents below 10 ppm, further processing of the oil
according to the process of distillative de-acidification is easily possible. A number of other
ions, such as magnesium, calcium, zinc, as well as iron, can be removed from the oil, e.g.,
to below 0.1 ppm. Thus, this product possesses ideal prerequisites for good oxidation resistance
during further processing and storage.

The phospholipases and methods of the invention can also be used for
reducing the amount of phosphorus-containing components in vegetable and animal oils as
described, e.g., in EP patent EP 0513709. In this method, the content of phosphorus-containing
components, especially phosphatides, such as lecithin, and the iron content in
vegetable and animal oils, which have previously been deslimed, e.g., soya oil, are reduced by
enzymatic breakdown using a phospholipase A1, A2 or B of the invention.

The phospholipases and methods of the invention can also be used for refining
fat or oils as described, e.g., in JP 06306386. The invention provides processes for refining a
fat or oil comprising a step of converting a phospholipid in a fat or an oil into a water-soluble
phosphoric-group-containing substance and removing this substance. The action of an
enzyme of the invention (e.g., a PLC) is utilized to convert the phospholipid into the
substance. Thus, it is possible to refine a fat or oil without carrying out an alkali refining step
from which industrial wastes containing alkaline waste water and a large amount of oil are
produced. Improvement of yields can be accomplished because the loss of neutral fat or oil
from escape with the wastes can be reduced to zero. In one aspect, gummy substances are
converted into water-soluble substances and removed as water-soluble substances by adding
an enzyme of the invention having a phospholipase C activity in the stage of degumming the
crude oil and conducting enzymatic treatment. In one aspect, the phospholipase C of the
invention has an activity that cuts ester bonds of glycerin and phosphoric acid in
phospholipids. If necessary, the method can comprise washing the enzyme-treated oil with
water or an acidic aqueous solution. In one aspect, the enzyme of the invention is added to
and reacted with the crude oil. The amount of phospholipase C employed can be 10 to
10,000 units, or about 100 to 2,000 units, per 1 kg of crude oil.
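
By way of illustration only, the per-kilogram dosage above can be scaled to a batch of crude
oil. The following Python sketch is not taken from JP 06306386; the function name
plc_units_range and the 500 kg batch size are assumptions introduced here for the example.

```python
def plc_units_range(
    crude_oil_kg: float,
    units_per_kg: tuple[float, float] = (10.0, 10_000.0),
) -> tuple[float, float]:
    """Scale a (low, high) phospholipase C dose in units/kg to a whole batch."""
    if crude_oil_kg < 0:
        raise ValueError("crude_oil_kg must be non-negative")
    low, high = units_per_kg
    if low > high:
        raise ValueError("units_per_kg must be given as (low, high)")
    return crude_oil_kg * low, crude_oil_kg * high


# Hypothetical 500 kg batch: broad range, then the narrower range from the text.
print(plc_units_range(500.0))
print(plc_units_range(500.0, (100.0, 2_000.0)))
```

For the assumed 500 kg batch this gives 5,000 to 5,000,000 units over the broad range and
50,000 to 1,000,000 units over the narrower one.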

The phospholipases and methods of the invention can also be used for
water-degumming processes as described, e.g., in Dijkstra, Albert J., et al., Oleagineux, Corps Gras,
Lipides (1998), 5(5), 367-370. In this exemplary method, the water-degumming process is
used for the production of lecithin and for dry degumming processes using a degumming acid
and bleaching earth. This method may be economically feasible only for oils with a low
phosphatide content, e.g., palm oil, lauric oils, etc. For seed oils having a high NHP content,
the acid refining process is used, whereby this process is carried out at the oil mill to allow
gum disposal via the meal. In one aspect, the acid-refined oil is subjected to a "polishing"
operation carried out prior to physical refining.

The phospholipases and methods of the invention can also be used for
degumming processes as described, e.g., in Dijkstra, et al., Res. Dev. Dep., N.V.
Vandemoortele Coord. Cent., Izegem, Belg. JAOCS, J. Am. Oil Chem. Soc. (1989),
66:1002-1009. In this exemplary method, the total degumming process involves dispersing an acid
such as H3PO4 or citric acid into soybean oil, allowing a contact time, and then mixing a base
such as caustic soda or Na silicate into the acid-in-oil emulsion. This keeps the degree of
neutralization low enough to avoid forming soaps, because that would lead to increased oil
loss. Subsequently, the oil is passed to a centrifugal separator where most of the gums are
removed from the oil stream to yield a gum phase with minimal oil content. The oil stream is
then passed to a second centrifugal separator to remove all remaining gums to yield a dilute
gum phase, which is recycled. Washing and drying or in-line alkali refining complete the
process. After the adoption of the total degumming process, in comparison with the classical
alkali refining process, an overall yield improvement of about 0.5% is realized. The totally
degummed oil can be subsequently alkali refined, bleached and deodorized, or bleached and
physically refined.

The phospholipases and methods of the invention can also be used for the
removal of nonhydratable phospholipids from a plant oil, e.g., soybean oil, as described, e.g.,
in Hvolby, et al., Sojakagefabr., Copenhagen, Den., J. Amer. Oil Chem. Soc. (1971)
48:503-509. In this exemplary method, water-degummed oil is mixed at different fixed pH values
with buffer solutions with and without Ca++, Mg/Ca-binding reagents, and surfactants. The
nonhydratable phospholipids can be removed in a nonconverted state as a component of
micelles or of mixed emulsifiers. Furthermore, the nonhydratable phospholipids are
removable by conversion into dissociated forms, e.g., by removal of Mg and Ca from the
phosphatidates, which can be accomplished by acidulation or by treatment with
Mg/Ca-complexing or Mg/Ca-precipitating reagents. Removal or chemical conversion of the
nonhydratable phospholipids can result in reduced emulsion formation and in improved
separation of the deacidified oil from the emulsion layer and the soapstock.

The phospholipases and methods of the invention can also be used for the
degumming of vegetable oils as described, e.g., by Buchold, et al., Frankfurt/Main, Germany.
Fett Wissenschaft Technologie (1993), 95(8), 300-304. In this exemplary process of the
invention for the degumming of edible vegetable oils, aqueous suspensions of an enzyme of
the invention, e.g., phospholipase A2, are used to hydrolyze the fatty acid bound at the sn2
position of the phospholipid, resulting in 1-acyl-lysophospholipids which are insoluble in oil
and thus more amenable to physical separation. Even the addition of small amounts
corresponding to about 700 lecitase units/kg oil results in a residual P concentration of less
than 10 ppm, so that chemical refining is replaceable by physical refining, eliminating the
necessity for neutralization, soapstock splitting, and wastewater treatment.

The phospholipases and methods of the invention can also be used for the
degumming of vegetable oils as described, e.g., by EnzyMax. Dahlke, Klaus. Dept. G-PDO,
Lurgi Öl-Gas-Chemie GmbH, Frankfurt, Germany. Oleagineux, Corps Gras, Lipides
(1997), 4(1), 55-57. This exemplary process is a degumming process for the physical
refining of almost any kind of oil. By an enzymatic-catalyzed hydrolysis, phosphatides are
converted to water-soluble lysophosphatides which are separated from the oil by
centrifugation. The residual phosphorus content in the enzymatically degummed oil can be
as low as 2 ppm P.

The phospholipases and methods of the invention can also be used for the
degumming of vegetable oils as described, e.g., by Cleenewerck, et al., N.V. Vamo Mills,
Izegem, Belg. Fett Wissenschaft Technologie (1992), 94:317-22; and, Clausen, Kim; Nielsen,
Munk. Novozymes A/S, Den. Dansk Kemi (2002) 83(2):24-27. The phospholipases and
methods of the invention can incorporate the pre-refining of vegetable oils with acids as
described, e.g., by Nilsson-Johansson, et al., Fats Oils Div., Alfa-Laval Food Eng. AB,
Tumba, Swed. Fett Wissenschaft Technologie (1988), 90(11), 447-51; and, Munch, Ernst
W. Cereol Deutschland GmbH, Mannheim, Germany. Editor(s): Wilson, Richard F.
Proceedings of the World Conference on Oilseed Processing Utilization, Cancun, Mexico,
Nov. 12-17, 2000 (2001), Meeting Date 2000, 17-20.

The phospholipases and methods of the invention can also be used for the
degumming of vegetable oils as described, e.g., by Jerzewska, et al., Inst. Przemyslu
Miesnego i Tluszczowego, Warsaw, Pol., Tluszcze Jadalne (2001), 36(3/4), 97-110. In this
process of the invention, enzymatic degumming of hydrated low-erucic acid rapeseed oil is
performed by use of a phospholipase A2 of the invention. The enzyme can catalyze the hydrolysis of
fatty acid ester linkages to the central carbon atom of the glycerol moiety in phospholipids. It
can hydrolyze non-hydratable phospholipids to their corresponding hydratable
lyso-compounds. With a nonpurified enzyme preparation, better results can be achieved with the
addition of 2% preparation for 4 hours (87% P removal).

Purification of phytosterols from vegetable oils

The invention provides methods for purification of phytosterols and
triterpenes, or plant sterols, from vegetable oils. Phytosterols that can be purified using
phospholipases and methods of the invention include β-sitosterol, campesterol, stigmasterol,
stigmastanol, β-sitostanol, sitostanol, desmosterol, chalinasterol, poriferasterol, chonasterol
and brassicasterol. Plant sterols are important agricultural products for health and nutritional
industries. Thus, phospholipases and methods of the invention are used to make emulsifiers
for cosmetic manufacturers and steroidal intermediates and precursors for the production of
hormone pharmaceuticals. Phospholipases and methods of the invention are used to make
(e.g., purify) analogs of phytosterols and their esters for use as cholesterol-lowering agents
with cardiologic health benefits. Phospholipases and methods of the invention are used to
purify plant sterols to reduce serum cholesterol levels by inhibiting cholesterol absorption in
the intestinal lumen. Phospholipases and methods of the invention are used to purify plant
sterols that have immunomodulating properties at extremely low concentrations, including
enhanced cellular response of T lymphocytes and cytotoxic ability of natural killer cells
against a cancer cell line. Phospholipases and methods of the invention are used to purify
plant sterols for the treatment of pulmonary tuberculosis, rheumatoid arthritis, management
of HIV-infected patients and inhibition of immune stress, e.g., in marathon runners.

Phospholipases and methods of the invention are used to purify sterol
components present in the sterol fractions of commodity vegetable oils (e.g., coconut, canola,
cocoa butter, corn, cottonseed, linseed, olive, palm, peanut, rice bran, safflower, sesame,
soybean, sunflower oils), such as sitosterol (40.2-92.3%), campesterol (2.6-38.6%),
stigmasterol (0-31%) and 5-avenasterol (1.5-29%).
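
The composition ranges listed above lend themselves to a simple plausibility check on a measured sterol profile. The sketch below is illustrative: the function name and the sample profile are hypothetical, and the ranges are simply those quoted in the text (percent of the sterol fraction).

```python
from typing import Dict, List

# Ranges (percent of the sterol fraction) as quoted in the text.
REPORTED_RANGES = {
    "sitosterol": (40.2, 92.3),
    "campesterol": (2.6, 38.6),
    "stigmasterol": (0.0, 31.0),
    "5-avenasterol": (1.5, 29.0),
}


def out_of_range(profile: Dict[str, float]) -> List[str]:
    """Return the sterols that are missing or fall outside the quoted range."""
    flagged = []
    for sterol, (low, high) in REPORTED_RANGES.items():
        value = profile.get(sterol)
        if value is None or not (low <= value <= high):
            flagged.append(sterol)
    return flagged


if __name__ == "__main__":
    # Hypothetical measured profile, for illustration only.
    sample = {
        "sitosterol": 55.0,
        "campesterol": 20.0,
        "stigmasterol": 18.0,
        "5-avenasterol": 0.5,
    }
    print(out_of_range(sample))
```

A flagged sterol does not necessarily indicate an error; it only signals that the value lies outside the ranges reported for commodity oils.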

Methods of the invention can incorporate isolation of plant-derived sterols in
oil seeds by solvent extraction with chloroform-methanol, hexane, methylene chloride, or
acetone, followed by saponification and chromatographic purification for obtaining enriched
total sterols. Alternatively, the plant samples can be extracted by supercritical fluid
extraction with supercritical carbon dioxide to obtain total lipid extracts from which sterols
can be enriched and isolated. For subsequent characterization and quantification of sterol
compounds, the crude isolate can be purified and separated by a wide variety of
chromatographic techniques including column chromatography (CC), gas chromatography,
thin-layer chromatography (TLC), normal phase high-performance liquid chromatography
(HPLC), reversed-phase HPLC and capillary electrochromatography. Of all chromatographic
isolation and separation techniques, CC and TLC procedures are the most accessible,
affordable and suitable for sample clean-up, purification, qualitative assays and preliminary
estimates of the sterols in test samples.

Phytosterols are lost in the byproducts generated during edible oil
refining processes. Phospholipases and methods of the invention use phytosterols isolated
from such byproducts to make phytosterol-enriched products.
Phytosterol isolation and purification methods of the invention can incorporate oil processing
industry byproducts and can comprise operations such as molecular distillation, liquid-liquid
extraction and crystallization.

Methods of the invention can incorporate processes for the extraction of lipids
to extract phytosterols. For example, methods of the invention can use nonpolar solvents such
as hexane (commonly used to extract most types of vegetable oils) to quantitatively extract free
phytosterols and phytosteryl fatty-acid esters. Steryl glycosides and fatty-acylated steryl
glycosides are only partially extracted with hexane, and increasing the polarity of the solvent
gives a higher percentage of extraction. One procedure that can be used is the Bligh and Dyer
chloroform-methanol method for extraction of all sterol lipid classes, including
phospholipids. One exemplary method to both qualitatively separate and quantitatively
analyze phytosterol lipid classes comprises injection of the lipid extract into an HPLC system.

Phospholipases and methods of the invention can be used to remove sterols
from fats and oils, as described, e.g., in U.S. Patent No. 6,303,803. This is a method for
reducing sterol content of sterol-containing fats and oils. It is an efficient and cost effective
process based on the affinity of cholesterol and other sterols for amphipathic molecules that
form hydrophobic, fluid bilayers, such as phospholipid bilayers. Aggregates of phospholipids
are contacted with, for example, a sterol-containing fat or oil in an aqueous environment and
then mixed. The molecular structure of this aggregated phospholipid mixture has a high
affinity for cholesterol and other sterols, and can selectively remove such molecules from fats
and oils. The aqueous separation mixture is mixed for a time sufficient to selectively reduce
the sterol content of the fat/oil product through partitioning of the sterol into the portion of
phospholipid aggregates. The sterol-reduced fat or oil is separated from the aqueous
separation mixture. Alternatively, the correspondingly sterol-enriched fraction also may be
isolated from the aqueous separation mixture. Because these steps can be performed at ambient
temperatures, costs involved in heating are minimized, as is the possibility of thermal
degradation of the product. Additionally, a minimal amount of equipment is required, and
since all required materials are food grade, the methods require no special precautions
regarding handling, waste disposal, or contamination of the final product(s).

Phospholipases and methods of the invention can be used to remove sterols
from fats and oils, as described, e.g., in U.S. Patent No. 5,880,300. Phospholipid aggregates
are contacted with, for example, a sterol-containing fat or oil in an aqueous environment and
then mixed. Following adequate mixing, the sterol-reduced fat or oil is separated from the
aqueous separation mixture. Alternatively, the correspondingly sterol-enriched phospholipid
also may be isolated from the aqueous separation mixture. Plant (e.g., vegetable) oils contain
plant sterols (phytosterols) that also may be removed using the methods of the present
invention. This method is applicable to a fat/oil product at any stage of a commercial
processing cycle. For example, the process of the invention may be applied to refined,
bleached and deodorized oils ("RBD oils"), or to any stage of processing prior to attainment
of RBD status. Although RBD oil may have an altered density compared to pre-RBD oil, the
processes of the invention are readily adapted to either RBD or pre-RBD oils, or to various
other fat/oil products, by variation of phospholipid content, phospholipid composition,
phospholipid:water ratios, temperature, pressure, mixing conditions, and separation
conditions as described below.

Alternatively, the enzymes and methods of the invention can be used to isolate
phytosterols or other sterols at intermediate steps in oil processing. For example, it is known
that phytosterols are lost during deodorization of plant oils. A sterol-containing distillate
fraction from, for example, an intermediate stage of processing can be subjected to the
sterol-extraction procedures described above. This provides a sterol-enriched lecithin or other
phospholipid material that can be further processed in order to recover the extracted sterols.

Detergent Compositions 

The invention provides detergent compositions comprising one or more
phospholipases of the invention, and methods of making and using these compositions. The
invention incorporates all methods of making and using detergent compositions, see, e.g.,
U.S. Patent Nos. 6,413,928; 6,399,561; 6,365,561; 6,380,147. The detergent compositions can
be a one and two part aqueous composition, a non-aqueous liquid composition, a cast solid, a
granular form, a particulate form, a compressed tablet, a gel and/or a paste and a slurry form.
The invention also provides methods capable of a rapid removal of gross food soils, films of
food residue and other minor food compositions using these detergent compositions.
Phospholipases of the invention can facilitate the removal of stains by means of catalytic
hydrolysis of phospholipids. Phospholipases of the invention can be used in dishwashing
detergents and in textile laundering detergents.

The actual active enzyme content depends upon the method of manufacture of
a detergent composition and is not critical, assuming the detergent solution has the desired
enzymatic activity. In one aspect, the amount of phospholipase present in the final solution
ranges from about 0.001 mg to 0.5 mg per gram of the detergent composition. The particular
enzyme chosen for use in the process and products of this invention depends upon the
conditions of final utility, including the physical product form, use pH, use temperature, and
soil types to be degraded or altered. The enzyme can be chosen to provide optimum activity
and stability for any given set of utility conditions. In one aspect, the polypeptides of the
present invention are active in the pH ranges of from about 4 to about 12 and in the
temperature range of from about 20°C to about 95°C. The detergents of the invention can
comprise cationic, semi-polar nonionic or zwitterionic surfactants; or, mixtures thereof.
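
The dosage, pH and temperature ranges given above can be expressed as a simple check on a proposed use condition. This is a sketch under stated assumptions: the function and constant names are hypothetical, the bounds are the approximate ("about") values from the text, and the upper temperature bound is taken as 95°C.

```python
from typing import List, Tuple

# Approximate bounds quoted in the text ("in one aspect").
PH_RANGE = (4.0, 12.0)
TEMP_RANGE_C = (20.0, 95.0)           # upper bound taken as 95 degrees C
DOSE_RANGE_MG_PER_G = (0.001, 0.5)    # mg enzyme per gram of detergent


def _outside(value: float, bounds: Tuple[float, float]) -> bool:
    low, high = bounds
    return not (low <= value <= high)


def check_condition(ph: float, temperature_c: float, dose_mg_per_g: float) -> List[str]:
    """Return a list of out-of-range findings; an empty list means all values fit."""
    problems = []
    if _outside(ph, PH_RANGE):
        problems.append(f"pH {ph} is outside {PH_RANGE}")
    if _outside(temperature_c, TEMP_RANGE_C):
        problems.append(f"temperature {temperature_c} C is outside {TEMP_RANGE_C}")
    if _outside(dose_mg_per_g, DOSE_RANGE_MG_PER_G):
        problems.append(f"dose {dose_mg_per_g} mg/g is outside {DOSE_RANGE_MG_PER_G}")
    return problems


if __name__ == "__main__":
    # Hypothetical wash condition, for illustration only.
    print(check_condition(ph=9.5, temperature_c=40.0, dose_mg_per_g=0.05))
```

Passing this check says nothing about whether a particular enzyme is optimal under the condition; it only confirms the values lie inside the stated ranges.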

Phospholipases of the present invention can be formulated into powdered and
liquid detergents having pH between 4.0 and 12.0 at levels of about 0.01 to about 5%
(preferably 0.1% to 0.5%) by weight. These detergent compositions can also include other
enzymes such as known proteases, cellulases, lipases or endoglycosidases, as well as builders
and stabilizers. The addition of phospholipases of the invention to conventional cleaning
compositions does not create any special use limitation. In other words, any temperature and
pH suitable for the detergent is also suitable for the present compositions as long as the pH is
within the above range, and the temperature is below the described enzyme's denaturing
temperature. In addition, the polypeptides of the invention can be used in a cleaning
composition without detergents, again either alone or in combination with builders and
stabilizers.
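
The weight-percent levels above translate directly into an enzyme mass for a given batch. The following minimal sketch assumes a hypothetical batch size and function name; only the 0.01% to 5% range and the preferred 0.1% to 0.5% range come from the text.

```python
def enzyme_mass_g(batch_mass_g: float, weight_percent: float) -> float:
    """Mass of phospholipase (g) for a detergent batch at a given weight percent."""
    if batch_mass_g <= 0.0:
        raise ValueError("batch_mass_g must be positive")
    if not 0.01 <= weight_percent <= 5.0:
        raise ValueError("weight_percent is outside the 0.01-5% range quoted in the text")
    return batch_mass_g * weight_percent / 100.0


if __name__ == "__main__":
    # A 1000 g batch is an assumed value; 0.1% and 0.5% bound the preferred range.
    for pct in (0.1, 0.5):
        print(f"{pct}% of 1000 g -> {enzyme_mass_g(1000.0, pct):.2f} g enzyme")
```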

The present invention provides cleaning compositions including detergent
compositions for cleaning hard surfaces, detergent compositions for cleaning fabrics,
dishwashing compositions, oral cleaning compositions, denture cleaning compositions, and
contact lens cleaning solutions.

In one aspect, the invention provides a method for washing an object
comprising contacting the object with a phospholipase of the invention under conditions
sufficient for washing. A phospholipase of the invention may be included as a detergent
additive. The detergent composition of the invention may, for example, be formulated as a
hand or machine laundry detergent composition comprising a phospholipase of the invention.
A laundry additive suitable for pre-treatment of stained fabrics can comprise a phospholipase
of the invention. A fabric softener composition can comprise a phospholipase of the
invention. Alternatively, a phospholipase of the invention can be formulated as a detergent
composition for use in general household hard surface cleaning operations. In alternative
aspects, detergent additives and detergent compositions of the invention may comprise one or
more other enzymes such as a protease, a lipase, a cutinase, another phospholipase, a
carbohydrase, a cellulase, a pectinase, a mannanase, an arabinase, a galactanase, a xylanase,
an oxidase, e.g., a laccase, and/or a peroxidase. The properties of the enzyme(s) of the
invention are chosen to be compatible with the selected detergent (i.e., pH-optimum,
compatibility with other enzymatic and non-enzymatic ingredients, etc.) and the enzyme(s) is
present in effective amounts. In one aspect, phospholipase enzymes of the invention are used
to remove malodorous materials from fabrics. Various detergent compositions and methods
for making them that can be used in practicing the invention are described in, e.g., U.S.
Patent Nos. 6,333,301; 6,329,333; 6,326,341; 6,297,038; 6,309,871; 6,204,232; 6,197,070;
5,856,164.

Waste treatment 

The phospholipases of the invention can be used in waste treatment. In one
aspect, the invention provides a solid waste digestion process using phospholipases of the
invention. The methods can comprise reducing the mass and volume of substantially
untreated solid waste. Solid waste can be treated with an enzymatic digestive process in the
presence of an enzymatic solution (including phospholipases of the invention) at a controlled
temperature. The solid waste can be converted into a liquefied waste and any residual solid
waste. The resulting liquefied waste can be separated from any residual solid waste.
See, e.g., U.S. Patent No. 5,709,796.

Other uses for the phospholipases of the invention 

The phospholipases of the invention can also be used to study the
phosphoinositide (PI) signaling system; in the diagnosis, prognosis and development of
treatments for bipolar disorders (see, e.g., Pandey (2002) Neuropsychopharmacology
26:216-228); as antioxidants; as modified phospholipids; as foaming and gelation agents; to
generate angiogenic lipids for vascularizing tissues; to identify phospholipase, e.g., PLA, PLB,
PLC, PLD and/or patatin modulators (agonists or antagonists), e.g., inhibitors for use as
anti-neoplastic, anti-inflammatory and analgesic agents. They can be used to generate acidic
phospholipids for controlling the bitter taste in food and pharmaceuticals. They can be used
in fat purification. They can be used to identify peptide inhibitors for the treatment of viral,
inflammatory, allergic and cardiovascular diseases. They can be used to make vaccines.
They can be used to make polyunsaturated fatty acid glycerides and phosphatidylglycerols.

The phospholipases of the invention, for example PLA and PLC enzymes, are
used to generate immunotoxins and various therapeutics for anti-cancer treatments.

The phospholipases of the invention can be used in conjunction with other
enzymes for decoloring (i.e., chlorophyll removal) and in detergents (see above), e.g., in
conjunction with other enzymes (e.g., lipases, proteases, esterases, phosphatases). For
example, in any instance where a PLC is used, a PLD and a phosphatase may be used in
combination, to produce the same result as a PLC alone.
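
The equivalence stated above between a PLC and a PLD-plus-phosphatase combination can be illustrated with a small bookkeeping model of the reaction products. This is a sketch only: the species labels and function names are hypothetical, and "same result" is read here as referring to the 1,2-diacylglycerol product, which is an interpretation rather than a statement from the text. The polar by-products differ between the two routes (a phospho base versus a free base plus phosphate).

```python
from collections import Counter


def plc(pool: Counter) -> Counter:
    """PLC: phospholipid -> 1,2-diacylglycerol + phospho base."""
    out = Counter(pool)
    n = out.pop("phospholipid", 0)
    out["1,2-diacylglycerol"] += n
    out["phospho base"] += n
    return out


def pld(pool: Counter) -> Counter:
    """PLD: phospholipid -> 1,2-diacylglycerophosphate + base."""
    out = Counter(pool)
    n = out.pop("phospholipid", 0)
    out["1,2-diacylglycerophosphate"] += n
    out["base"] += n
    return out


def phosphatase(pool: Counter) -> Counter:
    """Phosphatase: 1,2-diacylglycerophosphate -> 1,2-diacylglycerol + phosphate."""
    out = Counter(pool)
    n = out.pop("1,2-diacylglycerophosphate", 0)
    out["1,2-diacylglycerol"] += n
    out["phosphate"] += n
    return out


if __name__ == "__main__":
    start = Counter({"phospholipid": 100})  # arbitrary number of molecules
    route_plc = plc(start)
    route_pld = phosphatase(pld(start))
    print("PLC alone:        ", dict(route_plc))
    print("PLD + phosphatase:", dict(route_pld))
```

Both routes convert every phospholipid molecule into one molecule of 1,2-diacylglycerol in this idealized model, which assumes complete conversion at each step.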

The invention will be further described with reference to the following
examples; however, it is to be understood that the invention is not limited to such examples.



EXAMPLES 

EXAMPLE 1: BLAST PROGRAM USED FOR SEQUENCE IDENTITY PROFILING

This example describes an exemplary sequence identity program to determine
if a nucleic acid is within the scope of the invention. An NCBI BLAST 2.2.2 program is
used, with default options to blastp. All default values were used except for the default
filtering setting (i.e., all parameters set to default except filtering, which is set to OFF); in its
place a "-F F" setting is used, which disables filtering. Use of default filtering often results in
Karlin-Altschul violations due to short length of sequence. The default values used in this example:

"Filter for low complexity: ON
> Word Size: 3
> Matrix: Blosum62
> Gap Costs: Existence: 11
> Extension: 1"

Other default settings were: filter for low complexity OFF, word size of 3 for
protein, BLOSUM62 matrix, gap existence penalty of -11 and a gap extension penalty of -1.
The "-W" option was set to default to 0. This means that, if not set, the word size defaults to
3 for proteins and 11 for nucleotides. The settings read:
«README.bls.txt»
>
> blastall arguments:
>
> -p Program Name [String]
> -d Database [String]
>    default = nr
> -i Query File [File In]
>    default = stdin
> -e Expectation value (E) [Real]
>    default = 10.0
> -m alignment view options:
>    0 = pairwise,
>    1 = query-anchored showing identities,
>    2 = query-anchored no identities,
>    3 = flat query-anchored, show identities,
>    4 = flat query-anchored, no identities,
>    5 = query-anchored no identities and blunt ends,
>    6 = flat query-anchored, no identities and blunt ends,
>    7 = XML Blast output,
>    8 = tabular,
>    9 tabular with comment lines [Integer]
>    default = 0
> -o BLAST report Output File [File Out] Optional
>    default = stdout
> -F Filter query sequence (DUST with blastn, SEG with others) [String]
>    default = T
> -G Cost to open a gap (zero invokes default behavior) [Integer]
>    default = 0
> -E Cost to extend a gap (zero invokes default behavior) [Integer]
>    default = 0
> -X X dropoff value for gapped alignment (in bits) (zero invokes default
>    behavior) [Integer]
>    default = 0
> -I Show GI's in deflines [T/F]
>    default = F
> -q Penalty for a nucleotide mismatch (blastn only) [Integer]
>    default = -3
> -r Reward for a nucleotide match (blastn only) [Integer]
>    default = 1
> -v Number of database sequences to show one-line descriptions for (V)
>    [Integer]
>    default = 500
> -b Number of database sequence to show alignments for (B) [Integer]
>    default = 250
> -f Threshold for extending hits, default if zero [Integer]
>    default = 0
> -g Perform gapped alignment (not available with tblastx) [T/F]
>    default = T
> -Q Query Genetic code to use [Integer]
>    default = 1
> -D DB Genetic code (for tblast[nx] only) [Integer]
>    default = 1
> -a Number of processors to use [Integer]
>    default = 1
> -O SeqAlign file [File Out] Optional
> -J Believe the query defline [T/F]
>    default = F
> -M Matrix [String]
>    default = BLOSUM62
> -W Word size, default if zero [Integer]
>    default = 0
> -z Effective length of the database (use zero for the real size)
>    [String]
>    default = 0
> -K Number of best hits from a region to keep (off by default, if used a
>    value of 100 is recommended) [Integer]
>    default = 0
> -P 0 for multiple hits 1-pass, 1 for single hit 1-pass, 2 for 2-pass
>    [Integer]
>    default = 0
> -Y Effective length of the search space (use zero for the real size)
>    [Real]
>    default = 0
> -S Query strands to search against database (for blast[nx], and
>    tblastx). 3 is both, 1 is top, 2 is bottom [Integer]
>    default = 3
> -T Produce HTML output [T/F]
>    default = F
> -l Restrict search of database to list of GI's [String] Optional
> -U Use lower case filtering of FASTA sequence [T/F] Optional
>    default = F
> -y Dropoff (X) for blast extensions in bits (0.0 invokes default
>    behavior) [Real]
>    default = 0.0
> -Z X dropoff value for final gapped alignment (in bits) [Integer]
>    default = 0
> -R PSI-TBLASTN checkpoint file [File In] Optional
> -n MegaBlast search [T/F]
>    default = F
> -L Location on query sequence [String] Optional
> -A Multiple Hits window size (zero for single hit algorithm) [Integer]
>    default = 40

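The settings described in this example can be assembled into a blastall command line. The sketch below only builds and prints the argument list; it does not run BLAST. The function name and the query file name are hypothetical, and the flags used (-p, -d, -i, -e, -F, -M, -G, -E, -W, -o) are those listed in the arguments above.

```python
import shlex
from typing import List, Optional


def build_blastall_command(query_path: str,
                           database: str = "nr",
                           output_path: Optional[str] = None) -> List[str]:
    """Assemble a blastall argument list matching the settings of Example 1."""
    args = [
        "blastall",
        "-p", "blastp",      # program name
        "-d", database,      # database (listed default: nr)
        "-i", query_path,    # query file
        "-e", "10.0",        # expectation value (listed default)
        "-F", "F",           # low-complexity filtering switched off
        "-M", "BLOSUM62",    # scoring matrix (listed default)
        "-G", "11",          # cost to open a gap
        "-E", "1",           # cost to extend a gap
        "-W", "0",           # 0 selects the default word size (3 for proteins)
    ]
    if output_path is not None:
        args += ["-o", output_path]
    return args


if __name__ == "__main__":
    cmd = build_blastall_command("query.fasta")  # hypothetical file name
    print(shlex.join(cmd))
```

Passing the arguments as a list, rather than as a single shell string, avoids quoting problems if the command is later executed with a process-spawning call.
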


EXAMPLE 2: SIMULATION OF PLC MEDIATED DEGUMMING

This example describes the simulation of phospholipase C (PLC)-mediated
degumming.

Due to its poor solubility in water, phosphatidylcholine (PC) was originally
dissolved in ethanol (100 mg/ml). For initial testing, a stock solution of PC in 50 mM
3-morpholinopropanesulphonic acid or 60 mM citric acid/NaOH at pH 6 was prepared. The PC
stock solution (10 µl, 1 µg/µl) was added to 500 µl of refined soybean oil (2% water) in an
Eppendorf tube. To generate an emulsion the content of the tube was mixed for 3 min by
vortexing (see Fig. 5A). The oil and the water phase were separated by centrifugation for 1
min at 13,000 rpm (Fig. 5B). The reaction tubes were pre-incubated at the desired
temperature (37°C, 50°C, or 60°C) and 3 µl of PLC from Bacillus cereus (0.9 U/µl) were
added to the water phase (Fig. 5C). The disappearance of PC was analyzed by TLC using
chloroform/methanol/water (65:25:4) as a solvent system (see, e.g., Taguchi (1975) supra)
and was visualized after exposure to I2 vapor.
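
The quantities in the protocol above can be tallied per reaction tube. The sketch below is illustrative: the names are hypothetical, the PC stock is read as 10 µl at 1 µg/µl, and the "2% water" is assumed to be volume per volume of the oil.

```python
# Quantities per reaction tube, as read from the protocol.
PC_STOCK_VOLUME_UL = 10.0
PC_STOCK_UG_PER_UL = 1.0
OIL_VOLUME_UL = 500.0
WATER_FRACTION = 0.02          # "2% water", assumed here to be v/v
PLC_VOLUME_UL = 3.0
PLC_ACTIVITY_U_PER_UL = 0.9


def reaction_summary() -> dict:
    """Return the derived amounts of substrate, water and enzyme per tube."""
    pc_ug = PC_STOCK_VOLUME_UL * PC_STOCK_UG_PER_UL
    water_ul = OIL_VOLUME_UL * WATER_FRACTION
    plc_units = PLC_VOLUME_UL * PLC_ACTIVITY_U_PER_UL
    return {
        "pc_ug": pc_ug,
        "water_ul": water_ul,
        "plc_units": plc_units,
        "plc_units_per_ug_pc": plc_units / pc_ug,
    }


if __name__ == "__main__":
    for name, value in reaction_summary().items():
        print(f"{name}: {value:.2f}")
```

Keeping the inputs as named constants makes it straightforward to recompute the enzyme-to-substrate ratio when the stock concentrations or volumes are changed.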

Figure 5 schematically illustrates a model two-phase system for simulation of
PLC-mediated degumming. Fig. 5A: Generation of emulsion by mixing crude oil with 2%
water to hydrate the contaminating phosphatides (P). Fig. 5B: The oil and water phases are
separated after centrifugation and PLC is added to the water phase, which contains the
precipitated phosphatides ("gums"). The PLC hydrolysis takes place in the water phase. Fig.
5C: The time course of the reaction is monitored by withdrawing aliquots from the water
phase and analyzing them by TLC.